NUCLEIC ACID-ASSOCIATED PROTEINS

TECHNICAL FIELD 
The invention relates to novel nucleic acids, nucleic acid-associated proteins encoded by 
these nucleic acids, and to the use of these nucleic acids and proteins in the diagnosis, treatment, and 
prevention of cell proliferative, neurological, developmental, and autoimmune/inflammatory 
disorders, and infections. The invention also relates to the assessment of the effects of exogenous 
compounds on the expression of nucleic acids and nucleic acid-associated proteins. 

BACKGROUND OF THE INVENTION 

Multicellular organisms are composed of diverse cell types that differ dramatically both in
structure and function. The identity of a cell is determined by its characteristic pattern of gene
expression, and different cell types express overlapping but distinctive sets of genes throughout 
development. Spatial and temporal regulation of gene expression is critical for the control of cell 
proliferation, cell differentiation, apoptosis, and other processes that contribute to organismal 
development. Furthermore, gene expression is regulated in response to extracellular signals that 
mediate cell-cell communication and coordinate the activities of different cell types. Appropriate 
gene regulation also ensures that cells function efficiently by expressing only those genes whose 
functions are required at a given time. 
Transcription Factors 

Transcriptional regulatory proteins are essential for the control of gene expression. Some of 
these proteins function as transcription factors that initiate, activate, repress, or terminate gene 
transcription. Transcription factors generally bind to the promoter, enhancer, and upstream 
regulatory regions of a gene in a sequence-specific manner, although some factors bind regulatory 
elements within or downstream of a gene coding region. Transcription factors may bind to a specific 
region of DNA singly or as a complex with other accessory factors. (Reviewed in Lewin, B. (1990) 
Genes IV, Oxford University Press, New York, NY, and Cell Press, Cambridge, MA, pp. 554-570.)

The double helix structure and repeated sequences of DNA create topological and chemical 
features which can be recognized by transcription factors. These features are hydrogen bond donor 
and acceptor groups, hydrophobic patches, major and minor grooves, and regular, repeated stretches 
of sequence which induce distinct bends in the helix. Typically, transcription factors recognize 
specific DNA sequence motifs of about 20 nucleotides in length. Multiple, adjacent transcription 
factor-binding motifs may be required for gene regulation. 

Many transcription factors incorporate DNA-binding structural motifs which comprise either
α helices or β sheets that bind to the major groove of DNA. Four well-characterized structural motifs
are helix-turn-helix, zinc finger, leucine zipper, and helix-loop-helix. Proteins containing these motifs
may act alone as monomers, or they may form homo- or heterodimers that interact with DNA.

The helix-turn-helix motif consists of two α helices connected at a fixed angle by a short
chain of amino acids. One of the helices binds to the major groove. Helix-turn-helix motifs are
exemplified by the homeobox motif which is present in homeodomain proteins. These proteins are
critical for specifying the anterior-posterior body axis during development and are conserved
throughout the animal kingdom. The Antennapedia and Ultrabithorax proteins of Drosophila
melanogaster are prototypical homeodomain proteins. (Pabo, C.O. and R.T. Sauer (1992) Annu. Rev.
Biochem. 61:1053-1095.)

The zinc finger motif, which binds zinc ions, generally contains tandem repeats of about 30
amino acids consisting of periodically spaced cysteine and histidine residues. Examples of this
sequence pattern, designated C2H2 and C3HC4 ("RING" finger), have been described. (Lewin,
supra.) Zinc finger proteins each contain an α helix and an antiparallel β sheet whose proximity and
conformation are maintained by the zinc ion. Contact with DNA is made by the arginine preceding
the α helix and by the second, third, and sixth residues of the α helix. Variants of the zinc finger motif
include poorly defined cysteine-rich motifs which bind zinc or other metal ions. These motifs may
not contain histidine residues and are generally nonrepetitive. The zinc finger motif may be repeated
in a tandem array within a protein, such that the α helix of each zinc finger in the protein makes
contact with the major groove of the DNA double helix. This repeated contact between the protein
and the DNA produces a strong and specific DNA-protein interaction. The strength and specificity of
the interaction can be regulated by the number of zinc finger motifs within the protein. Though
originally identified in DNA-binding proteins as regions that interact directly with DNA, zinc fingers
occur in a variety of proteins that do not bind DNA (Lodish, H. et al. (1995) Molecular Cell Biology,
Scientific American Books, New York, NY, pp. 447-451). For example, Galcheva-Gargova, Z. et al.
((1996) Science 272:1797-1802) have identified zinc finger proteins that interact with various
cytokine receptors.

The C2H2-type zinc finger signature motif contains a 28 amino acid sequence, including 2
conserved Cys and 2 conserved His residues in a C-2-C-12-H-3-H type motif. The motif generally
occurs in multiple tandem repeats. A cysteine-rich domain including the motif Asp-His-His-Cys
(DHHC-CRD) has been identified as a distinct subgroup of zinc finger proteins. The DHHC-CRD
region has been implicated in growth and development. One DHHC-CRD mutant shows defective
function of Ras, a small membrane-associated GTP-binding protein that regulates cell growth and
differentiation, while other DHHC-CRD proteins probably function in pathways not involving Ras
(Bartels, D.J. et al. (1999) Mol. Cell Biol. 19:6775-6787).
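
The C-2-C-12-H-3-H spacing described above can be read as a simple sequence pattern. The following is a minimal sketch, assuming the spacing is taken literally (exactly 2, 12, and 3 intervening residues); the function name find_c2h2 and the example peptide are hypothetical and are given for illustration only. Naturally occurring C2H2 fingers may show some variation in spacing that this sketch does not capture.

```python
import re

# Literal reading of the C-2-C-12-H-3-H spacing: Cys, 2 residues, Cys,
# 12 residues, His, 3 residues, His. "." stands for any amino acid.
C2H2_PATTERN = re.compile(r"C.{2}C.{12}H.{3}H")

def find_c2h2(protein_seq):
    """Return (start_index, matched_text) for each non-overlapping match."""
    return [(m.start(), m.group()) for m in C2H2_PATTERN.finditer(protein_seq.upper())]

# Hypothetical peptide constructed to contain one finger-like stretch.
example = "MKCPECGKAFSRSDELTRHIRIHTG"
for start, hit in find_c2h2(example):
    print(start, hit)
```

Applying find_c2h2 to a sequence that contains several such stretches in a row returns one entry per repeat, which mirrors the tandem arrangement of the motif.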

Zinc-finger transcription factors are often accompanied by modular sequence motifs such as
the Kruppel-associated box (KRAB) and the SCAN domain. For example, the
hypoalphalipoproteinemia susceptibility gene ZNF202 encodes a SCAN box and a KRAB domain
followed by eight C2H2 zinc-finger motifs (Honer, C. et al. (2001) Biochim. Biophys. Acta
1517:441-448). The SCAN domain is a highly conserved, leucine-rich motif of approximately 60
amino acids found at the amino-terminal end of zinc finger transcription factors. SCAN domains are
most often linked to C2H2 zinc finger motifs through their carboxyl-terminal end. Biochemical
binding studies have established the SCAN domain as a selective hetero- and homotypic
oligomerization domain. SCAN domain-mediated protein complexes may function to modulate the
biological function of transcription factors (Schumacher, C. et al. (2000) J. Biol. Chem.
275:17173-17179).

The KRAB (Kruppel-associated box) domain is a conserved amino acid sequence spanning
approximately 75 amino acids and is found in almost one-third of the 300 to 700 genes encoding
C2H2 zinc fingers. The KRAB domain is found N-terminally with respect to the finger repeats. The
KRAB domain is generally encoded by two exons; the KRAB-A region or box is encoded by one
exon and the KRAB-B region or box is encoded by a second exon. The function of the KRAB
domain is the repression of transcription. Transcription repression is accomplished by recruitment of
either the KRAB-associated protein-1, a transcriptional corepressor, or the KRAB-A interacting
protein. Proteins containing the KRAB domain are likely to play a regulatory role during
development (Williams, A.J. et al. (1999) Mol. Cell Biol. 19:8526-8535). A subgroup of highly
related human KRAB zinc finger proteins detectable in all human tissues is highly expressed in
human T lymphoid cells (Bellefroid, E.J. et al. (1993) EMBO J. 12:1363-1374). The ZNF85 KRAB
zinc finger gene, a member of the human ZNF91 family, is highly expressed in normal adult testis, in
seminomas, and in the NT2/D1 teratocarcinoma cell line (Poncelet, D.A. et al. (1998) DNA Cell
Biol. 17:931-943).

Additional zinc finger-associated proteins include the sprouty (SPRY) protein, which was
first identified in a genetic screen in Drosophila. SPRY proteins are classified by virtue of their
characteristic cysteine-rich residues located in their carboxyl termini (Wong, E.S.M., et al. (2001) J.
Biol. Chem. 276:5866-5875). Zinc-binding B-box motifs are located within the B30.2-like domain,
constituting a diverse family of proteins (Seto, M.H., et al. (1999) Proteins 35:235-249). The
functions of these domains include regulation of cell growth and differentiation. The SPRY domain
has been identified as a subdomain within the B30.2-like domain (Torok, M. and Etkin, L.D. (2001)
Differentiation 67:63-71). The B-box domain itself is involved in growth control and transcriptional
regulation. These genes possess several conserved motifs that always include a B-box zinc binding
motif associated with various other motifs such as the RING zinc finger. The RING finger domain is
a zinc-binding Cys-His protein motif found in various proteins involved in signal transduction, gene
transcription, differentiation, and morphogenesis. A RING-B-box-coiled-coil (RBCC) subclass of
RING-finger proteins contains an NH2-terminal RING-finger followed by either single or multiple
additional B-box zinc finger domains (Spencer, J.A., et al. (2000) J. Cell Biol. 150:771-784). Several
RBCC proteins have been implicated in oncogenesis. The RET finger protein (RFP) also belongs to
the B-box zinc finger protein family. RFPs possess a tripartite motif consisting of a RING finger, a
B-box finger, and a coiled-coil domain. RFP may become oncogenic when its tripartite motif
becomes fused with the tyrosine kinase domain of the RET protein (Tezel, G., et al. (1999) Pathol.
Int. 49:881-886).

The C4 motif is found in hormone-regulated proteins. The C4 motif generally includes only
2 repeats. A number of eukaryotic and viral proteins contain a conserved cysteine-rich domain of
40 to 60 residues (called C3HC4 zinc-finger or RING finger) that binds two atoms of zinc, and is
probably involved in mediating protein-protein interactions. The 3D "cross-brace" structure of the
zinc ligation system is unique to the RING domain. The spacing of the cysteines in such a domain is
C-x(2)-C-x(9 to 39)-C-x(1 to 3)-H-x(2 to 3)-C-x(2)-C-x(4 to 48)-C-x(2)-C. The PHD finger is a
C4HC3 zinc-finger-like motif found in nuclear proteins thought to be involved in chromatin-mediated
transcriptional regulation.
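
The RING finger spacing given above can be converted mechanically into a search pattern. The following is a minimal sketch, assuming the spacing is rewritten with comma-separated ranges (for example, "x(9 to 39)" written as "x(9,39)"); the helper names spacing_to_regex and find_ring and the test peptide are hypothetical and serve only as an illustration. A match indicates only that the cysteine and histidine spacing is present, not that the region folds as a RING domain.

```python
import re

# Spacing from the text, with "a to b" written as "a,b".
RING_SPACING = "C-x(2)-C-x(9,39)-C-x(1,3)-H-x(2,3)-C-x(2)-C-x(4,48)-C-x(2)-C"

def spacing_to_regex(spacing):
    """Translate 'C-x(2)-C-x(9,39)-...' into a regular expression string."""
    parts = []
    for element in spacing.split("-"):
        gap = re.fullmatch(r"x\((\d+)(?:,(\d+))?\)", element)
        if gap is None:
            parts.append(element)                      # fixed residue, e.g. C or H
        elif gap.group(2) is None:
            parts.append(".{%s}" % gap.group(1))       # exact gap length
        else:
            parts.append(".{%s,%s}" % gap.groups())    # variable gap length
    return "".join(parts)

RING_REGEX = re.compile(spacing_to_regex(RING_SPACING))

def find_ring(protein_seq):
    """Return the start index of the first RING-like spacing, or None."""
    match = RING_REGEX.search(protein_seq.upper())
    return None if match is None else match.start()

# Hypothetical peptide built only to satisfy the spacing.
peptide = "MA" + "CAAC" + "A" * 10 + "CAAHAACAAC" + "A" * 10 + "CAAC" + "KK"
print(find_ring(peptide))
```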

GATA-type transcription factors contain one or two zinc finger domains which bind
specifically to a region of DNA that contains the consecutive nucleotide sequence GATA. NMR
studies indicate that the zinc finger comprises two irregular anti-parallel β sheets and an α helix,
followed by a long loop to the C-terminal end of the finger (Omichinski, J.G. (1993) Science
261:438-446). The helix and the loop connecting the two β sheets contact the major groove of the
DNA, while the C-terminal part, which determines the specificity of binding, wraps around into the
minor groove.

The LIM motif consists of about 60 amino acid residues and contains seven conserved
cysteine residues and a histidine within a consensus sequence (Schmeichel, K.L. and M.C. Beckerle
(1994) Cell 79:211-219). The LIM family includes transcription factors and cytoskeletal proteins
which may be involved in development, differentiation, and cell growth. One example is
actin-binding LIM protein, which may play roles in regulation of the cytoskeleton and cellular
morphogenesis (Roof, D.J. et al. (1997) J. Cell Biol. 138:575-588). The N-terminal domain of
actin-binding LIM protein has four double zinc finger motifs with the LIM consensus sequence. The
C-terminal domain of actin-binding LIM protein shows sequence similarity to known actin-binding
proteins such as dematin and villin. Actin-binding LIM protein binds to F-actin through its
dematin-like C-terminal domain. The LIM domain may mediate protein-protein interactions with other
LIM-binding proteins.

Myeloid cell development is controlled by tissue-specific transcription factors. Myeloid zinc
finger proteins (MZF) include MZF-1 and MZF-2. MZF-1 functions in regulation of the development
of neutrophilic granulocytes. A murine homolog MZF-2 is expressed in myeloid cells, particularly in
the cells committed to the neutrophilic lineage. MZF-2 is down-regulated by G-CSF and appears to
have a unique function in neutrophil development (Murai, K. et al. (1997) Genes Cells 2:581-591).

The leucine zipper motif comprises a stretch of amino acids rich in leucine which can form an
amphipathic α helix. This structure provides the basis for dimerization of two leucine zipper proteins.
The region adjacent to the leucine zipper is usually basic, and upon protein dimerization, is optimally
positioned for binding to the major groove. Proteins containing such motifs are generally referred to
as bZIP transcription factors. The leucine zipper motif is found in the proto-oncogenes Fos and Jun,
which comprise the heterodimeric transcription factor AP1 involved in cell growth and the
determination of cell lineage (Papavassiliou, A.G. (1995) N. Engl. J. Med. 332:45-47).

Many neoplastic disorders in humans can be attributed to inappropriate gene expression.
Malignant cell growth may result from either excessive expression of tumor promoting genes or
insufficient expression of tumor suppressor genes (Cleary, M.L. (1992) Cancer Surv. 15:89-104).
Chromosomal translocations may also produce chimeric loci which fuse the coding sequence of one
gene with the regulatory regions of a second unrelated gene. Such an arrangement likely results in
inappropriate gene transcription, potentially contributing to malignancy. One clinically relevant
zinc-finger protein is WT1, a tumor-suppressor protein that is inactivated in children with Wilms' tumor.
The oncogene bcl-6, which plays an important role in large-cell lymphoma, is also a zinc-finger
protein (Papavassiliou, A.G. supra).

The helix-loop-helix motif (HLH) consists of a short α helix connected by a loop to a longer α
helix. The loop is flexible and allows the two helices to fold back against each other and to bind to
DNA. The transcription factor Myc contains a prototypical HLH motif. 

The NF-kappa-B/Rel signature defines a family of eukaryotic transcription factors involved in
oncogenesis, embryonic development, differentiation and immune response. Most transcription
factors containing the Rel homology domain (RHD) bind as dimers to a consensus DNA sequence
motif termed kappa-B. Members of the Rel family share a highly conserved 300 amino acid domain
termed the Rel homology domain. The characteristic Rel C-terminal domain is involved in gene
activation and cytoplasmic anchoring functions. Proteins known to contain the RHD domain include
vertebrate nuclear factor NF-kappa-B, which is a heterodimer of a DNA-binding subunit and the
transcription factor p65, mammalian transcription factor RelB, and vertebrate proto-oncogene c-rel, a
protein associated with differentiation and lymphopoiesis (Kabrun, N. and P.J. Enrietto (1994) Semin.
Cancer Biol. 5:103-112).

A DNA binding motif termed ARID (AT-rich interactive domain) distinguishes an
evolutionarily conserved family of proteins. The approximately 100-residue ARID sequence is
present in a series of proteins strongly implicated in the regulation of cell growth, development, and
tissue-specific gene expression. ARID proteins include Bright (a regulator of B-cell-specific gene
expression), dead ringer (involved in development), and MRF-2 (which represses expression from the
cytomegalovirus enhancer) (Dallas, P.B. et al. (2000) Mol. Cell Biol. 20:3137-3146).

The ELM2 (Egl-27 and MTA1 homology 2) domain is found in metastasis-associated protein
MTA1 and protein ER1. The Caenorhabditis elegans gene egl-27 is required for embryonic
patterning. MTA1, a human gene with elevated expression in metastatic carcinomas, is a component
of a protein complex with histone deacetylase and nucleosome remodelling activities (Solari, F. et al.
(1999) Development 126:2483-2494). The ELM2 domain is usually found to the N terminus of a
myb-like DNA binding domain. ELM2 is also found associated with an ARID DNA-binding domain.

The Iroquois (Irx) family of genes is found in nematodes, insects and vertebrates. Irx genes
usually occur in one or two genomic clusters of three genes each and encode transcriptional
controllers that possess a characteristic homeodomain. The Irx genes function early in development
to specify the identity of diverse territories of the body. Later in development in both Drosophila and
vertebrates, the Irx genes function again to subdivide those territories into smaller domains. (For a
review of Iroquois genes, see Cavodeassi, F. et al. (2001) Development 128:2847-2855.) For
example, mouse and human Irx4 proteins are 83% conserved and their 63-aa homeodomain is more
than 93% identical to that of the Drosophila Iroquois patterning genes. Irx4 transcripts are
predominantly expressed in the cardiac ventricles. The homeobox gene Irx4 mediates ventricular
differentiation during cardiac development (Bruneau, B.G. et al. (2000) Dev. Biol. 217:266-77).

Histidine triad (HIT) proteins share residues in distinctive dimeric, 10-stranded half-barrel
structures that form two identical purine nucleotide-binding sites. Hint (histidine triad
nucleotide-binding protein)-related proteins, found in all forms of life, and fragile histidine triad
(Fhit)-related proteins, found in animals and fungi, represent the two main branches of the HIT
superfamily. Fhit homologs bind and cleave diadenosine polyphosphates. Fhit-Ap(n)A complexes
appear to function in a proapoptotic tumor suppression pathway in epithelial tissues (Brenner C. et al.
(1999) J. Cell Physiol. 181:179-187).

Most transcription factors contain characteristic DNA binding motifs, and variations on the 
above motifs and new motifs have been and are currently being characterized (Faisst, S. and S. Meyer 
(1992) Nucl. Acids Res. 20:3-26). These include the forkhead motif, found in transcription factors 
involved in development and oncogenesis (Hacker, U. et al. (1995) EMBO J. 14:5306-5317). Foxj2
is a human forkhead transcriptional activator that binds DNA with a dual sequence specificity. Foxj2
expression is activated early in zygotic development (Granadino, B. et al. (2000) Mech. Dev.
97:157-160).

WO 03/006618 PCT/US02/21971

Cold-shock proteins (Csp) are involved in a specific pattern of gene expression in response to 
abrupt shifts to lower temperatures. This pattern includes the induction of cold-shock proteins, 
synthesis of proteins involved in transcription and translation, and repression of heat-shock proteins.
The major cold-shock protein, cold-shock protein A (CspA), has high sequence similarity with three
other proteins: CspB, CspC, and CspD. The Csp proteins share sequence similarity with other
prokaryotic proteins and with the 'cold-shock domain' of eukaryotic Y-box proteins (Jones, P.G. and 
Inouye, M. (1994) Mol. Microbiol. 11:811-818). 

Chromatin Associated Proteins

In the nucleus, DNA is packaged into chromatin, the compact organization of which limits the
accessibility of DNA to transcription factors and plays a key role in gene regulation. (Lewin, supra,
pp. 409-410.) The compact structure of chromatin is determined and influenced by
chromatin-associated proteins such as the histones, the high mobility group (HMG) proteins, and the
chromodomain proteins. There are five classes of histones, H1, H2A, H2B, H3, and H4, all of which
are highly basic, low molecular weight proteins. The fundamental unit of chromatin, the nucleosome,
consists of 200 base pairs of DNA associated with two copies each of H2A, H2B, H3, and H4. H1
links adjacent nucleosomes. HMG proteins are low molecular weight, non-histone proteins that may
play a role in unwinding DNA and stabilizing single-stranded DNA. Chromodomain proteins play a
key role in the formation of highly compacted heterochromatin, which is transcriptionally silent.
Diseases and Disorders Related to Gene Regulation

Many neoplastic disorders in humans can be attributed to inappropriate gene expression.
Malignant cell growth may result from either excessive expression of tumor promoting genes or
insufficient expression of tumor suppressor genes (Cleary, M.L. (1992) Cancer Surv. 15:89-104).
The zinc finger-type transcriptional regulator WT1 is a tumor-suppressor protein that is inactivated in
children with Wilms' tumor. The oncogene bcl-6, which plays an important role in large-cell
lymphoma, is also a zinc-finger protein (Papavassiliou, A.G. (1995) N. Engl. J. Med. 332:45-47).
Chromosomal translocations may also produce chimeric loci that fuse the coding sequence of one
gene with the regulatory regions of a second unrelated gene. Such an arrangement likely results in
inappropriate gene transcription, potentially contributing to malignancy. In Burkitt's lymphoma, for
example, the transcription factor Myc is translocated to the immunoglobulin heavy chain locus,
greatly enhancing Myc expression and resulting in rapid cell growth leading to leukemia (Latchman,
D.S. (1996) N. Engl. J. Med. 334:28-33).

In addition, the immune system responds to infection or trauma by activating a cascade of
events that coordinate the progressive selection, amplification, and mobilization of cellular defense
mechanisms. A complex and balanced program of gene activation and repression is involved in this
process. However, hyperactivity of the immune system as a result of improper or insufficient
regulation of gene expression may result in considerable tissue or organ damage. This damage is
well-documented in immunological responses associated with arthritis, allergens, heart attack, stroke,
and infections (Isselbacher, K.J. et al. Harrison's Principles of Internal Medicine, 13/e, McGraw Hill,
Inc. and Teton Data Systems Software, 1996). The causative gene for autoimmune
polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED) was recently isolated and found to
encode a protein with two PHD-type zinc finger motifs (Bjorses, P. et al. (1998) Hum. Mol. Genet.
7:1547-1553).

Furthermore, the generation of multicellular organisms is based upon the induction and
coordination of cell differentiation at the appropriate stages of development. Central to this process is
differential gene expression, which confers the distinct identities of cells and tissues throughout the
body. Failure to regulate gene expression during development could result in developmental
disorders. Human developmental disorders caused by mutations in zinc finger-type transcriptional
regulators include: urogenital developmental abnormalities associated with WT1; Greig
cephalopolysyndactyly, Pallister-Hall syndrome, and postaxial polydactyly type A (GLI3); and
Townes-Brocks syndrome, characterized by anal, renal, limb, and ear abnormalities (SALL1)
(Engelkamp, D. and V. van Heyningen (1996) Curr. Opin. Genet. Dev. 6:334-342; Kohlhase, J. et al.
(1999) Am. J. Hum. Genet. 64:435-445).

Human acute leukemias involve reciprocal chromosome translocations that fuse the ALL-1
gene located at chromosome region 11q23 to a series of partner genes positioned on a variety of
human chromosomes. The fused genes encode chimeric proteins. The AF17 gene encodes a protein
of 1093 amino acids, containing a leucine-zipper dimerization motif located 3' of the fusion point and
a cysteine-rich domain at the N terminus that shows homology to a domain within the protein Br140
(peregrin) (Prasad R. et al. (1994) Proc. Natl. Acad. Sci. USA 91:8107-8111).
SYNTHESIS OF NUCLEIC ACIDS 
Polymerases 

DNA and RNA replication are critical processes for cell replication and function. DNA and
RNA replication are mediated by the enzymes DNA and RNA polymerase, respectively, by a
"templating" process in which the nucleotide sequence of a DNA or RNA strand is copied by
complementary base-pairing into a complementary nucleic acid sequence of either DNA or RNA.
However, there are fundamental differences between the two processes.

DNA polymerase catalyzes the stepwise addition of a deoxyribonucleotide to the 3'-OH end
of a polynucleotide strand (the primer strand) that is paired to a second (template) strand. The new
DNA strand therefore grows in the 5' to 3' direction (Alberts, B. et al. (1994) The Molecular Biology
of the Cell, Garland Publishing Inc., New York, NY, pp. 251-254). The substrates for the
polymerization reaction are the corresponding deoxynucleotide triphosphates which must base-pair
with the correct nucleotide on the template strand in order to be recognized by the polymerase.
Because DNA exists as a double-stranded helix, each of the two strands may serve as a template for
the formation of a new complementary strand. Each of the two daughter cells of a dividing cell
therefore inherits a new DNA double helix containing one old and one new strand. Thus, DNA is
said to be replicated "semiconservatively" by DNA polymerase. In addition to the synthesis of new
DNA, DNA polymerase is also involved in the repair of damaged DNA as discussed below under
"Ligases."

In contrast to DNA polymerase, RNA polymerase uses a DNA template strand to "transcribe"
DNA into RNA using ribonucleotide triphosphates as substrates. Like DNA polymerization, RNA
polymerization proceeds in a 5' to 3' direction by addition of a ribonucleoside monophosphate to the
3'-OH end of a growing RNA chain. DNA transcription generates messenger RNAs (mRNA) that
carry information for protein synthesis, as well as the transfer, ribosomal, and other RNAs that have
structural or catalytic functions. In eukaryotes, three discrete RNA polymerases synthesize the three
different types of RNA (Alberts, supra, pp. 367-368). RNA polymerase I makes the large ribosomal
RNAs, RNA polymerase II makes the mRNAs that will be translated into proteins, and RNA
polymerase III makes a variety of small, stable RNAs, including 5S ribosomal RNA and the transfer
RNAs (tRNA). In all cases, RNA synthesis is initiated by binding of the RNA polymerase to a
promoter region on the DNA and synthesis begins at a start site within the promoter. Synthesis is
completed at a stop (termination) signal in the DNA whereupon both the polymerase and the
completed RNA chain are released.
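
The templating relationship described above, in which the RNA chain grows 5' to 3' while the template strand is read in the opposite direction, can be illustrated with a short sketch. The function name transcribe and the six-base template are hypothetical; promoter recognition, termination signals, and RNA processing are deliberately left out.

```python
# Base pairing applied when an RNA strand is built on a DNA template.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}

def transcribe(template_5to3):
    """Return the RNA (written 5'->3') copied from a DNA template (written 5'->3')."""
    rna = []
    # The template is read 3'->5', so walk the string backwards; each
    # appended base extends the 3'-OH end of the growing RNA chain.
    for base in reversed(template_5to3.upper()):
        rna.append(DNA_TO_RNA[base])
    return "".join(rna)

print(transcribe("TACGGT"))  # ACCGUA
```

The result is the reverse complement of the template with uracil in place of thymine, which is the same sequence as the non-template (coding) DNA strand apart from that substitution.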
Ligases 

DNA repair is the process by which accidental base changes, such as those produced by
oxidative damage, hydrolytic attack, or uncontrolled methylation of DNA, are corrected before
replication or transcription of the DNA can occur. Because of the efficiency of the DNA repair
process, fewer than one in a thousand accidental base changes causes a mutation (Alberts, supra, pp.
245-249). The three steps common to most types of DNA repair are (1) excision of the damaged or
altered base or nucleotide by DNA nucleases, (2) insertion of the correct nucleotide in the gap left by
the excised nucleotide by DNA polymerase using the complementary strand as the template, and (3)
sealing the break left between the inserted nucleotide(s) and the existing DNA strand by DNA ligase.
In the last reaction, DNA ligase uses the energy from ATP hydrolysis to activate the 5' end of the
broken phosphodiester bond before forming the new bond with the 3'-OH of the DNA strand. In
Bloom's syndrome, an inherited human disease, individuals are partially deficient in DNA ligation
and consequently have an increased incidence of cancer (Alberts, supra, p. 247).
Nucleases

Nucleases comprise enzymes that hydrolyze both DNA (DNase) and RNA (RNase). They
serve different purposes in nucleic acid metabolism. Nucleases hydrolyze the phosphodiester bonds
between adjacent nucleotides either at internal positions (endonucleases) or at the terminal 3' or 5'
nucleotide positions (exonucleases). A DNA exonuclease activity in DNA polymerase, for example,
serves to remove improperly paired nucleotides attached to the 3'-OH end of the growing DNA strand
by the polymerase and thereby serves a "proofreading" function. As mentioned above, DNA
endonuclease activity is involved in the excision step of the DNA repair process.

RNases also serve a variety of functions. For example, RNase P is a ribonucleoprotein
enzyme which cleaves the 5' end of pre-tRNAs as part of their maturation process. RNase H digests
the RNA strand of an RNA/DNA hybrid. Such hybrids occur in cells invaded by retroviruses, and
RNase H is an important enzyme in the retroviral replication cycle. Pancreatic RNase secreted by the
pancreas into the intestine hydrolyzes RNA present in ingested foods. RNase activity in serum and
cell extracts is elevated in a variety of cancers and infectious diseases (Schein, C.H. (1997) Nat.
Biotechnol. 15:529-536). Regulation of RNase activity is being investigated as a means to control
tumor angiogenesis, allergic reactions, viral infection and replication, and fungal infections.
MODIFICATION OF NUCLEIC ACIDS
Methylases

Methylation of specific nucleotides occurs in both DNA and RNA, and serves different
functions in the two macromolecules. Methylation of cytosine residues to form 5-methyl cytosine in
DNA occurs specifically in CG sequences which are base-paired with one another in the DNA
double-helix. The pattern of methylation is passed from generation to generation during DNA
replication by an enzyme called "maintenance methylase" that acts preferentially on those CG
sequences that are base-paired with a CG sequence that is already methylated. Such methylation
appears to distinguish active from inactive genes by preventing the binding of regulatory proteins that
"turn on" the gene, but permitting the binding of proteins that inactivate the gene (Alberts, supra, pp.
448-451). In RNA metabolism, "tRNA methylase" produces one of several nucleotide modifications
in tRNA that affect the conformation and base-pairing of the molecule and facilitate the recognition
of the appropriate mRNA codons by specific tRNAs. The primary methylation pattern is the
dimethylation of guanine residues to form N,N-dimethyl guanine.
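
The inheritance of a methylation pattern by the maintenance methylase can be pictured with a simplified model. The sketch below assumes that methylation is recorded as the positions of methylated cytosines on the parental strand; the function names cpg_sites and maintenance_methylate and the seven-base sequence are hypothetical. Because a CG on one strand pairs with a CG on the other, a methylated cytosine at position i on the parental strand marks the cytosine paired with the guanine at position i+1 on the new strand.

```python
def cpg_sites(seq):
    """Return the index of the C in every CG dinucleotide of a 5'->3' strand."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] == "CG"]

def maintenance_methylate(parental_seq, parental_methylated):
    """Return new-strand cytosines to methylate, in parental-strand coordinates.

    parental_methylated: set of indices of methylated C on the parental strand.
    Only CG sites already methylated on the parental strand are copied.
    """
    return {i + 1 for i in cpg_sites(parental_seq) if i in parental_methylated}

parental = "ACGTCGA"                      # CG sites at positions 1 and 4
print(cpg_sites(parental))                # [1, 4]
print(maintenance_methylate(parental, {1}))  # {2}
```

In this model an unmethylated CG site on the parental strand stays unmethylated on the new strand, so the pattern, rather than methylation as such, is what is passed on.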
Helicases and Single-stranded Binding Proteins 

Helicases are enzymes that destabilize and unwind double helix structures in both DNA and
RNA. Since DNA replication occurs more or less simultaneously on both strands, the two strands
must first separate to generate a replication "fork" for DNA polymerase to act on. Two types of
replication proteins contribute to this process, DNA helicases and single-stranded binding proteins.
DNA helicases hydrolyze ATP and use the energy of hydrolysis to separate the DNA strands.
Single-stranded binding proteins (SSBs) then bind to the exposed DNA strands, without covering the bases,
thereby temporarily stabilizing them for templating by the DNA polymerase (Alberts, supra, pp.
255-256).

RNA helicases also alter and regulate RNA conformation and secondary structure. Like the
DNA helicases, RNA helicases utilize energy derived from ATP hydrolysis to destabilize and unwind
RNA duplexes. The most well-characterized and ubiquitous family of RNA helicases is the
DEAD-box family, so named for the conserved B-type ATP-binding motif which is diagnostic of
proteins in this family. Over 40 DEAD-box helicases have been identified in organisms as diverse as
bacteria, insects, yeast, amphibians, mammals, and plants. DEAD-box helicases function in diverse
processes such as translation initiation, splicing, ribosome assembly, and RNA editing, transport, and
stability. Examples of these RNA helicases include yeast Drs1 protein, which is involved in ribosomal
RNA processing; yeast TIF1 and TIF2 and mammalian eIF-4A, which are essential to the initiation of
RNA translation; and human p68 antigen, which regulates cell growth and division (Ripmaster, T.L.
et al. (1992) Proc. Natl. Acad. Sci. USA 89:11131-11135; Chang, T.-H. et al. (1990) Proc. Natl. Acad.
Sci. USA 87:1571-1575). These RNA helicases demonstrate strong sequence homology over a stretch
of some 420 amino acids. Included among these conserved sequences are the consensus sequence for
the A motif of an ATP binding protein; the "DEAD box" sequence, associated with ATPase activity;
the sequence SAT, associated with the actual helicase unwinding region; and an octapeptide
consensus sequence, required for RNA binding and ATP hydrolysis (Pause, A. et al. (1993) Mol. Cell.
Biol. 13:6789-6798). Differences outside of these conserved regions are believed to reflect
differences in the functional roles of individual proteins (Chang et al., supra).
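
As a rough illustration of how the conserved sequences named above might be located in a candidate protein, the following sketch scans a one-letter amino acid sequence for the literal DEAD and SAT motifs. The function names and the test sequence are hypothetical, and the A motif and octapeptide consensus are omitted because their patterns are not spelled out here; finding both motifs is at most suggestive, not a classification.

```python
import re

# Literal motifs taken from the text; keys are descriptive labels only.
MOTIFS = {"DEAD box": "DEAD", "SAT": "SAT"}


def find_motifs(protein):
    """Return {motif label: [0-based start positions]} for a protein sequence."""
    protein = protein.upper()
    hits = {}
    for label, pattern in MOTIFS.items():
        # A zero-width lookahead reports overlapping occurrences as well.
        hits[label] = [m.start() for m in re.finditer(f"(?={pattern})", protein)]
    return hits


def has_dead_and_sat(protein):
    """True when at least one DEAD and at least one SAT motif are present."""
    hits = find_motifs(protein)
    return bool(hits["DEAD box"]) and bool(hits["SAT"])
```

On the made-up peptide MKTFDEADRMLSATQPK, `find_motifs` reports DEAD at position 4 and SAT at position 11.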

Some DEAD-box helicases play tissue- and stage-specific roles in spermatogenesis and
embryogenesis. Overexpression of the DEAD-box 1 protein (DDX1) may play a role in the
progression of neuroblastoma (Nb) and retinoblastoma (Rb) tumors (Godbout, R. et al. (1998) J. Biol.
Chem. 273:21161-21168). These observations suggest that DDX1 may promote or enhance tumor
progression by altering the normal secondary structure and expression levels of RNA in cancer cells.
Other DEAD-box helicases have been implicated either directly or indirectly in tumorigenesis.
(Discussed in Godbout et al., supra.) For example, murine p68 is mutated in ultraviolet light-induced
tumors, and human DDX6 is located at a chromosomal breakpoint associated with B-cell lymphoma.
Similarly, a chimeric protein comprised of DDX10 and NUP98, a nucleoporin protein, may be
involved in the pathogenesis of certain myeloid malignancies.
Topoisomerases 

In addition to being separated prior to replication, the two DNA strands must be
"unwound" from one another before DNA helicases can separate them. This function is performed
by proteins known as DNA topoisomerases. DNA topoisomerase effectively acts as a reversible
nuclease that hydrolyzes a phosphodiester bond in a DNA strand, permits the two strands to rotate
freely about one another to remove the strain of the helix, and then rejoins the original phosphodiester
bond. Topoisomerases are essential enzymes responsible for the topological
rearrangement of DNA brought about by transcription, replication, chromatin formation,
recombination, and chromosome segregation. Superhelical coils are introduced into DNA by the
passage of processive enzymes such as RNA polymerase, or by the separation of DNA strands by a
helicase prior to replication. Knotting and concatenation can occur in the process of DNA synthesis,
storage, and repair. All topoisomerases work by breaking a phosphodiester bond in the
deoxyribose-phosphate backbone of DNA. A catalytic tyrosine residue on the enzyme makes a
nucleophilic attack on the scissile phosphodiester bond, resulting in a reaction intermediate in which a
covalent bond is formed between the enzyme and one end of the broken strand. A tyrosine-DNA
phosphodiesterase functions in DNA repair by hydrolyzing this bond in occasional dead-end
topoisomerase I-DNA intermediates (Pouliot, J.J. et al. (1999) Science 286:552-555).

Two types of DNA topoisomerase exist, types I and II. Type I topoisomerases work as
monomers, making a break in a single strand of DNA, while type II topoisomerases, working as
homodimers, cleave both strands. DNA topoisomerase I causes a single-strand break in a DNA helix
to allow the rotation of the two strands of the helix about the remaining phosphodiester bond in the
opposite strand. DNA topoisomerase II causes a transient break in both strands of a DNA helix
where two double helices cross over one another. This type of topoisomerase can efficiently separate
two interlocked DNA circles (Alberts, supra, pp. 260-262). Type II topoisomerases are largely
confined to proliferating cells in eukaryotes, such as cancer cells. For this reason they are targets for
anticancer drugs. Topoisomerase II has been implicated in multi-drug resistance (MDR) as it appears
to aid in the repair of DNA damage inflicted by DNA binding agents such as doxorubicin and
vincristine. (DNA topoisomerases are reviewed in Wang, J.C. (1996) Annu. Rev. Biochem.
65:635-692.)

The topoisomerase I family includes topoisomerases I and III (topo I and topo III). The
crystal structure of human topoisomerase I suggests that rotation about the intact DNA strand is
partially controlled by the enzyme. In this "controlled rotation" model, protein-DNA interactions
limit the rotation, which is driven by torsional strain in the DNA (Stewart, L. et al. (1998) Science
279:1534-1541). Structurally, topo I can be recognized by its catalytic tyrosine residue and a number
of other conserved residues in the active site region. Topo I is thought to function during
transcription. Two topo IIIs are known in humans, and they are homologous to prokaryotic
topoisomerase I, with a conserved tyrosine and active site signature specific to this family. Topo III
has been suggested to play a role in meiotic recombination. A mouse topo III is highly expressed in
testis tissue and its expression increases with the increase in the number of cells in pachytene (Seki,
T. et al. (1998) J. Biol. Chem. 273:28553-28556).

The topoisomerase II family includes two isozymes (IIα and IIβ) encoded by different genes.
Topo II cleaves double-stranded DNA in a reproducible, nonrandom fashion, preferentially in an
AT-rich region, but the basis of cleavage site selectivity is not known. Structurally, topo II is made up
of four domains, the first two of which are structurally similar and probably distantly homologous to
similar domains in eukaryotic topo I. The second domain bears the catalytic tyrosine, as well as a
highly conserved pentapeptide. The IIα isoform appears to be responsible for unlinking DNA during
chromosome segregation. Cell lines expressing IIα but not IIβ suggest that IIβ is dispensable in
cellular processes; however, IIβ knockout mice died perinatally due to a failure in neural
development. That the major abnormalities occurred in predominantly late developmental events
(neurogenesis) suggests that IIβ is needed not at mitosis, but rather during DNA repair (Yang, X. et
al. (2000) Science 287:131-134).

Topoisomerases have been implicated in a number of disease states, and topoisomerase
poisons have proven to be effective anti-tumor drugs for some human malignancies. Topo I is
mislocalized in Fanconi's anemia, and may be involved in the chromosomal breakage seen in this
disorder (Wunder, E. (1984) Hum. Genet. 68:276-281). Overexpression of a truncated topo III in
ataxia-telangiectasia (A-T) cells partially suppresses the A-T phenotype, probably through a dominant
negative mechanism. This suggests that topo III is deregulated in A-T (Fritz, E. et al. (1997) Proc.
Natl. Acad. Sci. USA 94:4538-4542). Topo III also interacts with the Bloom's Syndrome gene
product, and has been suggested to have a role as a tumor suppressor (Wu, L. et al. (2000) J. Biol.
Chem. 275:9636-9644). Aberrant topo II activity is often associated with cancer or increased cancer
risk. Greatly lowered topo II activity has been found in some, but not all, A-T cell lines (Mohamed, R.
et al. (1987) Biochem. Biophys. Res. Commun. 149:233-238). On the other hand, topo II can break
DNA in the region of the A-T gene (ATM), which controls all DNA damage-responsive cell cycle
checkpoints (Kaufmann, W.K. (1998) Proc. Soc. Exp. Biol. Med. 217:327-334). The ability of
topoisomerases to break DNA has been used as the basis of antitumor drugs. Topoisomerase poisons
act by increasing the number of dead-end covalent DNA-enzyme complexes in the cell, ultimately
triggering cell death pathways (Fortune, J.M. and N. Osheroff (2000) Prog. Nucleic Acid Res. Mol.
Biol. 64:221-253; Guichard, S.M. and M.K. Danks (1999) Curr. Opin. Oncol. 11:482-489).
Antibodies against topo I are found in the serum of systemic sclerosis patients, and the levels of the
antibody may be used as a marker of pulmonary involvement in the disease (Diot, E. et al. (1999)
Chest 116:715-720). Finally, the DNA binding region of human topo I has been used as a DNA
delivery vehicle for gene therapy (Chen, T.Y. et al. (2000) Appl. Microbiol. Biotechnol. 53:558-567).

Recombinases

Genetic recombination is the process of rearranging DNA sequences within an organism's
genome to provide genetic variation for the organism in response to changes in the environment.
DNA recombination allows variation in the particular combination of genes present in an individual's
genome, as well as the timing and level of expression of these genes. (See Alberts, supra, pp.
263-273.) Two broad classes of genetic recombination are commonly recognized, general
recombination and site-specific recombination. General recombination involves genetic exchange
between any homologous pair of DNA sequences usually located on two copies of the same
chromosome. The process is aided by enzymes, recombinases, that "nick" one strand of a DNA
duplex more or less randomly and permit exchange with a complementary strand on another duplex.
The process does not normally change the arrangement of genes in a chromosome. In site-specific
recombination, the recombinase recognizes specific nucleotide sequences present in one or both of
the recombining molecules. Base-pairing is not involved in this form of recombination and therefore
it does not require DNA homology between the recombining molecules. Unlike general
recombination, this form of recombination can alter the relative positions of nucleotide sequences in
chromosomes.

RNA METABOLISM

Ribonucleic acid (RNA) is a linear single-stranded polymer assembled from four nucleotides,
ATP, CTP, UTP, and GTP. In most organisms, RNA is transcribed as a copy of deoxyribonucleic acid
(DNA), the genetic material of the organism. In retroviruses RNA rather than DNA serves as the
genetic material. RNA copies of the genetic material encode proteins or serve various structural,
catalytic, or regulatory roles in organisms. RNA is classified according to its cellular localization and
function. Messenger RNAs (mRNAs) encode polypeptides. Ribosomal RNAs (rRNAs) are assembled,
along with ribosomal proteins, into ribosomes, which are cytoplasmic particles that translate mRNA
into polypeptides. Transfer RNAs (tRNAs) are cytosolic adaptor molecules that function in mRNA
translation by recognizing both an mRNA codon and the amino acid that matches that codon.
Heterogeneous nuclear RNAs (hnRNAs) include mRNA precursors and other nuclear RNAs of
various sizes. Small nuclear RNAs (snRNAs) are a part of the nuclear spliceosome complex that
removes intervening, non-coding sequences (introns) and rejoins exons in pre-mRNAs.

Proteins are associated with RNA during its transcription from DNA, RNA processing, and
translation of mRNA into protein. Proteins are also associated with RNA as it is used for structural,
catalytic, and regulatory purposes.
RNA Processing 

Ribosomal RNAs (rRNAs) are assembled, along with ribosomal proteins, into ribosomes,
which are cytoplasmic particles that translate messenger RNA (mRNA) into polypeptides. The
eukaryotic ribosome is composed of a 60S (large) subunit and a 40S (small) subunit, which together
form the 80S ribosome. In addition to the 18S, 28S, 5S, and 5.8S rRNAs, ribosomes contain from 50
to over 80 different ribosomal proteins, depending on the organism. Ribosomal proteins are classified
according to which subunit they belong (i.e., L if associated with the large 60S subunit or S if
associated with the small 40S subunit). E. coli ribosomes have been the most thoroughly studied and
contain 50 proteins, many of which are conserved in all life forms. The structures of nine ribosomal
proteins have been solved to less than 3.0 Å resolution (i.e., S5, S6, S17, L1, L6, L9, L12, L14, L30),
revealing common motifs, such as β-α-β protein folds in addition to acidic and basic RNA-binding
motifs positioned between β-strands. Most ribosomal proteins are believed to contact rRNA directly
(reviewed in Liljas, A. and M. Garber (1995) Curr. Opin. Struct. Biol. 5:721-727; see also Woodson,
S.A. and N.B. Leontis (1998) Curr. Opin. Struct. Biol. 8:294-300; Ramakrishnan, V. and S.W. White
(1998) Trends Biochem. Sci. 23:208-212).

Ribosomal proteins may undergo post-translational modifications or interact with other
ribosome-associated proteins to regulate translation. For example, the highly homologous 40S
ribosomal protein S6 kinases (S6K1 and S6K2) play a key role in the regulation of cell growth by
controlling the biosynthesis of translational components which make up the protein synthetic
apparatus (including the ribosomal proteins). In the case of S6K1, at least eight phosphorylation sites
are believed to mediate kinase activation in a hierarchical fashion (Dufner and Thomas (1999) Exp.
Cell. Res. 253:100-109). Some of the ribosomal proteins, including L1, also function as translational
repressors by binding to polycistronic mRNAs encoding ribosomal proteins (reviewed in Liljas and
Garber, supra).

Recent evidence suggests that a number of ribosomal proteins have secondary functions
independent of their involvement in protein biosynthesis. These proteins function as regulators of
cell proliferation and, in some instances, as inducers of cell death. For example, the expression of
human ribosomal protein L13a has been shown to induce apoptosis by arresting cell growth in the
G2/M phase of the cell cycle. Inhibition of expression of L13a induces apoptosis in target cells,
which suggests that this protein is necessary, in the appropriate amount, for cell survival. Similar
results have been obtained in yeast where inactivation of yeast homologues of L13a, rp22 and rp23,
results in severe growth retardation and death. A closely related ribosomal protein, L7, arrests cells
in G1 and also induces apoptosis. Thus, a subset of ribosomal proteins may function as cell cycle
checkpoints and compose a new family of cell proliferation regulators.

Mapping of individual ribosomal proteins on the surface of intact ribosomes is accomplished
using 3D immunocryoelectronmicroscopy, whereby antibodies raised against specific ribosomal
proteins are visualized. Progress has been made toward the mapping of L1, L7, and L12 while the
structure of the intact ribosome has been solved to only 20-25 Å resolution and inconsistencies exist
among different crude structures (Frank, J. (1997) Curr. Opin. Struct. Biol. 7:266-272).

Three distinct sites have been identified on the ribosome. The aminoacyl-tRNA acceptor site
(A site) receives charged tRNAs (with the exception of the initiator-tRNA). The peptidyl-tRNA site
(P site) binds the nascent polypeptide as the amino acid from the A site is added to the elongating
chain. Deacylated tRNAs bind in the exit site (E site) prior to their release from the ribosome. (The
structure of the ribosome is reviewed in Stryer, L. (1995) Biochemistry, W.H. Freeman and
Company, New York NY, pp. 888-908; Lodish, supra, pp. 119-138; and Lewin, B. (1997) Genes VI,
Oxford University Press, Inc. New York, NY.)

Various proteins are necessary for processing of transcribed RNAs in the nucleus.
Pre-mRNA processing steps include capping at the 5' end with methylguanosine, polyadenylating the
3' end, and splicing to remove introns. The primary RNA transcript from DNA is a faithful copy of the
gene containing both exon and intron sequences, and the latter sequences must be cut out of the RNA
transcript to produce an mRNA that codes for a protein. This "splicing" of the mRNA sequence takes
place in the nucleus with the aid of a large, multicomponent ribonucleoprotein complex known as a
spliceosome. The spliceosomal complex is comprised of five small nuclear ribonucleoprotein
particles (snRNPs) designated U1, U2, U4, U5, and U6. Each snRNP contains a single species of
snRNA and about ten proteins. The RNA components of some snRNPs recognize and base-pair with
intron consensus sequences. The protein components mediate spliceosome assembly and the splicing
reaction. Autoantibodies to snRNP proteins are found in the blood of patients with systemic lupus
erythematosus (Stryer, supra, p. 863).
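
The net effect of splicing on a primary transcript, as described above, can be sketched in a few lines. This is a bookkeeping illustration only, assuming the intron boundaries are already known; the function name `splice` and the coordinate convention (0-based, end-exclusive) are hypothetical choices, and nothing here models how the spliceosome recognizes the boundaries.

```python
def splice(pre_mrna, introns):
    """Remove introns from a primary transcript and join the flanking exons.

    pre_mrna -- RNA sequence of the primary transcript
    introns  -- iterable of (start, end) pairs, 0-based and end-exclusive
    """
    exons = []
    cursor = 0  # first position not yet assigned to an exon or intron
    for start, end in sorted(introns):
        if start < cursor or start >= end or end > len(pre_mrna):
            raise ValueError(f"invalid or overlapping intron: {(start, end)}")
        exons.append(pre_mrna[cursor:start])  # exon preceding this intron
        cursor = end                          # skip over the intron
    exons.append(pre_mrna[cursor:])           # final exon
    return "".join(exons)
```

For instance, removing the intron at positions 6 to 18 from AUGGCCGUAAGUUUUCAGGAAUAA joins the two exons into AUGGCCGAAUAA.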

Heterogeneous nuclear ribonucleoproteins (hnRNPs) have been identified that have roles in
splicing, exporting of the mature RNAs to the cytoplasm, and mRNA translation (Biamonti, G. et al.
(1998) Clin. Exp. Rheumatol. 16:317-326). Some examples of hnRNPs include the yeast proteins
Hrp1p, involved in cleavage and polyadenylation at the 3' end of the RNA; Cbp80p, involved in
capping the 5' end of the RNA; and Npl3p, a homolog of mammalian hnRNP A1, involved in export
of mRNA from the nucleus (Shen, E.C. et al. (1998) Genes Dev. 12:679-691). HnRNPs have been
shown to be important targets of the autoimmune response in rheumatic diseases (Biamonti, supra).

Many snRNP and hnRNP proteins are characterized by an RNA recognition motif (RRM).
(Reviewed in Birney, E. et al. (1993) Nucleic Acids Res. 21:5803-5816.) The RRM is about 80
amino acids in length and forms four β-strands and two α-helices arranged in an α/β sandwich. The
RRM contains a core RNP-1 octapeptide motif along with surrounding conserved sequences. In
addition to snRNP proteins, examples of RNA-binding proteins which contain the above motifs
include heteronuclear ribonucleoproteins which stabilize nascent RNA and factors which regulate
alternative splicing. Alternative splicing factors include developmentally regulated proteins, specific
examples of which have been identified in lower eukaryotes such as Drosophila melanogaster and
Caenorhabditis elegans. These proteins play key roles in developmental processes such as pattern
formation and sex determination, respectively. (See, for example, Hodgkin, J. et al. (1994)
Development 120:3681-3689.)

The 3' ends of most eukaryote mRNAs are also posttranscriptionally modified by
polyadenylation. Polyadenylation proceeds through two enzymatically distinct steps: (i) the
endonucleolytic cleavage of nascent mRNAs at cis-acting polyadenylation signals in the
3'-untranslated (non-coding) region and (ii) the addition of a poly(A) tract to the 5' mRNA fragment.
The presence of cis-acting RNA sequences is necessary for both steps. These sequences include
5'-AAUAAA-3' located 10-30 nucleotides upstream of the cleavage site and a less well-conserved GU-
or U-rich sequence element located 10-30 nucleotides downstream of the cleavage site. Cleavage
stimulation factor (CstF), cleavage factor I (CF I), and cleavage factor II (CF II) are involved in the
cleavage reaction while cleavage and polyadenylation specificity factor (CPSF) and poly(A)
polymerase (PAP) are necessary for both cleavage and polyadenylation. An additional enzyme,
poly(A)-binding protein II (PAB II), promotes poly(A) tract elongation (Ruegsegger, U. et al. (1996)
J. Biol. Chem. 271:6107-6113; and references within).
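
The spacing rule for the AAUAAA signal described above lends itself to a simple scan. The sketch below is illustrative and rests on stated assumptions: the function name `candidate_cleavage_windows` is hypothetical, the 10-30 nucleotide gap is measured from the 3' end of the hexamer (the text does not fix the reference point), and the downstream GU- or U-rich element is not checked.

```python
SIGNAL = "AAUAAA"


def candidate_cleavage_windows(rna, min_gap=10, max_gap=30):
    """List (signal_start, window_start, window_end) for each AAUAAA hexamer.

    The window spans the positions lying min_gap..max_gap nucleotides
    downstream of the hexamer, clipped to the end of the sequence.
    All coordinates are 0-based; DNA input (T) is converted to RNA (U).
    """
    rna = rna.upper().replace("T", "U")
    windows = []
    pos = rna.find(SIGNAL)
    while pos != -1:
        signal_end = pos + len(SIGNAL)
        first = signal_end + min_gap
        last = min(signal_end + max_gap, len(rna))
        if first <= last:  # skip signals too close to the 3' end
            windows.append((pos, first, last))
        pos = rna.find(SIGNAL, pos + 1)
    return windows
```

A transcript with AAUAAA starting at position 5 and 40 further nucleotides downstream yields a single candidate window from position 21 to 41.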

YT521-B is a nuclear protein that was identified by using a yeast two-hybrid screen for
proteins that interact with known mRNA splicing factors (Hartmann, A.M. et al. (1999) Mol. Biol.
Cell 10:3909-3926). The protein contains four nuclear localization signals, an N-terminal glutamic
acid-rich region, a glutamic acid/arginine-rich region, and a C-terminal proline-rich region. YT521
associates with the nuclear transcriptosomal component scaffold attachment factor B and with the Src
kinase substrate, Sam68. Phosphorylation of Sam68 by Src family kinase p59fyn reduces the
association of Sam68 with YT521-B. Both YT521 and Sam68 may participate in a signal
transduction pathway that controls alternative splice site selection.
TRANSLATION 

Correct translation of the genetic code depends upon each amino acid forming a linkage with
the appropriate transfer RNA (tRNA). The aminoacyl-tRNA synthetases (aaRSs) are essential
proteins found in all living organisms. The aaRSs are responsible for the activation and correct
attachment of an amino acid to its cognate tRNA, as the first step in protein biosynthesis.
Prokaryotic organisms have at least twenty different types of aaRSs, one for each different amino
acid, while eukaryotes usually have two aaRSs, a cytosolic form and a mitochondrial form, for each
different amino acid. The 20 aaRS enzymes can be divided into two structural classes. Class I
enzymes add amino acids to the 2' hydroxyl at the 3' end of tRNAs while Class II enzymes add amino
acids to the 3' hydroxyl at the 3' end of tRNAs. Each class is characterized by a distinctive topology
of the catalytic domain. Class I enzymes contain a catalytic domain based on the nucleotide-binding
Rossmann fold. In particular, a consensus tetrapeptide motif is highly conserved (Prosite Document
PDOC00161, Aminoacyl-transfer RNA synthetases class-I signature). Class I enzymes are specific
for arginine, cysteine, glutamic acid, glutamine, isoleucine, leucine, methionine, tyrosine, tryptophan,
and valine. Class II enzymes contain a central catalytic domain, which consists of a seven-stranded
antiparallel β-sheet domain, as well as N- and C-terminal regulatory domains. Class II enzymes are
separated into two groups based on the heterodimeric or homodimeric structure of the enzyme; the
latter group is further subdivided by the structure of the N- and C-terminal regulatory domains
(Hartlein, M. and S. Cusack (1995) J. Mol. Evol. 40:519-530). Class II enzymes are specific for
alanine, asparagine, aspartic acid, glycine, histidine, lysine, phenylalanine, proline, serine, and
threonine.
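
The class assignments listed above can be captured as a small lookup table. The sketch is only a restatement of the text in code; the function names and the use of three-letter amino acid codes are illustrative choices.

```python
# Amino acid specificities of the two synthetase classes, as listed in the text.
CLASS_I = {"Arg", "Cys", "Glu", "Gln", "Ile", "Leu", "Met", "Tyr", "Trp", "Val"}
CLASS_II = {"Ala", "Asn", "Asp", "Gly", "His", "Lys", "Phe", "Pro", "Ser", "Thr"}


def aars_class(amino_acid):
    """Return "I" or "II" for a three-letter amino acid code (case-insensitive)."""
    code = amino_acid.strip().capitalize()
    if code in CLASS_I:
        return "I"
    if code in CLASS_II:
        return "II"
    raise ValueError(f"unknown amino acid code: {amino_acid!r}")


def attachment_hydroxyl(amino_acid):
    """Hydroxyl at the tRNA 3' end that receives the amino acid, per the text."""
    return "2'" if aars_class(amino_acid) == "I" else "3'"
```

The two sets are disjoint and together cover the 20 amino acids, so every valid code resolves to exactly one class.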

Certain aaRSs also have editing functions. IleRS, for example, can misactivate valine to form
Val-tRNA(Ile), but this product is cleared by a hydrolytic activity that destroys the mischarged product.
This editing activity is located within a second catalytic site found in the connective polypeptide 1
region (CP1), a long insertion sequence within the Rossmann fold domain of Class I enzymes
(Schimmel, P. et al. (1998) FASEB J. 12:1599-1609). AaRSs also play a role in tRNA processing. It
has been shown that mature tRNAs are charged with their respective amino acids in the nucleus
before export to the cytoplasm, and charging may serve as a quality control mechanism to ensure the
tRNAs are functional (Martinis, S.A. et al. (1999) EMBO J. 18:4591-4596).

Under optimal conditions, polypeptide synthesis proceeds at a rate of approximately 40
amino acid residues per second. The rate of misincorporation during translation is on the order of
10^-4 and is primarily the result of aminoacyl-tRNAs being charged with the incorrect amino acid.
Incorrectly charged tRNAs are toxic to cells as they result in the incorporation of incorrect amino acid
residues into an elongating polypeptide. The rate of translation is presumed to be a compromise
between the optimal rate of elongation and the need for translational fidelity. Mathematical
calculations predict that 10^-4 is indeed the maximum acceptable error rate for protein synthesis in a
biological system (reviewed in Stryer, supra; and Watson, J. et al. (1987) The Benjamin/Cummings
Publishing Co., Inc. Menlo Park, CA). A particularly error prone aminoacyl-tRNA charging event is
the charging of tRNA(Gln) with Gln. A mechanism exists for the correction of this mischarging event
which likely has its origins in evolution. Gln was among the last of the 20 naturally occurring amino
acids used in polypeptide synthesis to appear in nature. Gram positive eubacteria, cyanobacteria,
Archaea, and eukaryotic organelles possess a noncanonical pathway for the synthesis of Gln-tRNA(Gln)
based on the transformation of Glu-tRNA(Gln) (synthesized by Glu-tRNA synthetase, GluRS) using the
enzyme Glu-tRNA(Gln) amidotransferase (Glu-AdT). The reactions involved in the transamidation
pathway are as follows (Curnow, A.W. et al. (1997) Nucleic Acids Symposium 36:2-4):



tRNA(Gln) + Glu + ATP --(GluRS)--> Glu-tRNA(Gln) + AMP + PPi

Glu-tRNA(Gln) + Gln + ATP --(Glu-AdT)--> Gln-tRNA(Gln) + Glu + ADP + Pi

A similar enzyme, Asp-tRNA(Asn) amidotransferase, exists in Archaea, which transforms
Asp-tRNA(Asn) to Asn-tRNA(Asn). Formylase, the enzyme that transforms Met-tRNA(fMet) to
fMet-tRNA(fMet) in eubacteria, is likely to be a related enzyme. A hydrolytic activity has also been
identified that destroys mischarged Val-tRNA(Ile) (Schimmel, P. et al. (1998) FASEB J.
12:1599-1609). One likely scenario for the evolution of Glu-AdT in primitive life forms is the absence
of a specific glutaminyl-tRNA synthetase (GlnRS), requiring an alternative pathway for the synthesis
of Gln-tRNA(Gln). In fact, deletion of the Glu-AdT operon in Gram positive bacteria is lethal
(Curnow, A.W. et al. (1997) Proc. Natl. Acad. Sci. USA 94:11819-11826). The existence of GluRS
activity in other organisms has been inferred by the high degree of conservation in translation
machinery in nature; however, GluRS has not been identified in all organisms, including Homo
sapiens. Such an enzyme would be responsible for ensuring translational fidelity and reducing the
synthesis of defective polypeptides.
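
The trade-off between elongation rate and translational fidelity mentioned earlier in this section can be made concrete with a short calculation. The sketch assumes that misincorporation events at different residues are independent and uses a per-residue error rate of 1e-4 as a default, which is an order-of-magnitude assumption rather than a measured value; the function names are hypothetical.

```python
def error_free_fraction(n_residues, error_rate=1e-4):
    """Expected fraction of chains of n_residues containing no wrong residue.

    Assumes each residue is misincorporated independently with probability
    error_rate, so the fraction is (1 - error_rate) ** n_residues.
    """
    return (1.0 - error_rate) ** n_residues


def synthesis_time_seconds(n_residues, residues_per_second=40.0):
    """Time to elongate a chain at the approximate rate quoted in the text."""
    return n_residues / residues_per_second
```

Under these assumptions roughly 90% of 1000-residue chains would be free of errors, and a 400-residue chain would take about 10 seconds to synthesize; raising the error rate tenfold lowers the error-free fraction for the 1000-residue chain to well under half.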

The different aaRSs are believed to be the result of divergent evolution, likely following gene
duplication events. Notably, amino acids such as Gln were among the last to appear in nature, and
evolutionary studies suggest that Gln-RSs appeared first in eukaryotes and were later horizontally
transferred to prokaryotes (Lamour, V. et al. (1994) Proc. Natl. Acad. Sci. U.S.A. 91:8670-74 and
Siatecka, M. et al. (1998) Eur. J. Biochem. 256:80-7). The importance of Gln-RS and Gln-tRNA(Gln)
is discussed below.

In addition to their function in protein synthesis, specific aminoacyl tRNA synthetases also
play roles in cellular fidelity, RNA splicing, RNA trafficking, apoptosis, and transcriptional and
translational regulation. For example, human tyrosyl-tRNA synthetase can be proteolytically cleaved
into two fragments with distinct cytokine activities. The carboxy-terminal domain exhibits monocyte
and leukocyte chemotaxis activity as well as stimulating production of myeloperoxidase, tumor
necrosis factor-α, and tissue factor. The N-terminal domain binds to the interleukin-8 type A receptor
and functions as an interleukin-8-like cytokine. Human tyrosyl-tRNA synthetase is secreted from
apoptotic tumor cells and may accelerate apoptosis (Wakasugi, K., and Schimmel, P. (1999) Science
284:147-151). Mitochondrial Neurospora crassa TyrRS and S. cerevisiae LeuRS are essential factors
for certain group I intron splicing activities, and human mitochondrial LeuRS can substitute for the
yeast LeuRS in a yeast null strain. Certain bacterial aaRSs are involved in regulating their own
transcription or translation (Martinis, supra). Several aaRSs are able to synthesize diadenosine
oligophosphates, a class of signalling molecules with roles in cell proliferation, differentiation, and
apoptosis (Kisselev, L.L. et al. (1998) FEBS Lett. 427:157-163; Vartanian, A. et al. (1999) FEBS Lett.
456:175-180).

Autoantibodies against aminoacyl-tRNAs are generated by patients with autoimmune
diseases such as rheumatoid arthritis, dermatomyositis and polymyositis, and correlate strongly with
complicating interstitial lung disease (ILD) (Freist, W. et al. (1999) Biol. Chem. 380:623-646; Freist,
W. et al. (1996) Biol. Chem. Hoppe Seyler 377:343-356). These antibodies appear to be generated in
response to viral infection, and coxsackie virus has been used to induce experimental viral myositis in
animals.

Comparison of aaRS structures between humans and pathogens has been useful in the design 
of novel antibiotics (Schimmel, supra). Genetically engineered aaRSs have been utilized to allow 
site-specific incorporation of unnatural amino acids into proteins in vivo (Liu, D.R. et al. (1997) Proc. 
Natl. Acad. Sci. USA 94:10092-10097). 

tRNA Modifications

The modified ribonucleoside, pseudouridine (ψ), is present ubiquitously in the anticodon
regions of transfer RNAs (tRNAs), large and small ribosomal RNAs (rRNAs), and small nuclear
RNAs (snRNAs). ψ is the most common of the modified nucleosides (i.e., other than G, A, U, and C)
present in tRNAs. Only a few yeast tRNAs that are not involved in protein synthesis do not contain ψ
(Cortese, R. et al. (1974) J. Biol. Chem. 249:1103-1108). The enzyme responsible for the conversion
of uridine to ψ, pseudouridine synthase (pseudouridylate synthase), was first isolated from Salmonella
typhimurium (Arena, F. et al. (1978) Nucleic Acids Res. 5:4523-4536). The enzyme has since been
isolated from a number of mammals, including steer and mice (Green, C.J. et al. (1982) J. Biol.
Chem. 257:3045-52; and Chen, J. and J.R. Patton (1999) RNA 5:409-419). tRNA pseudouridine
synthases have been the most extensively studied members of the family. They require a thiol donor
(e.g., cysteine) and a monovalent cation (e.g., ammonium or potassium) for optimal activity. Additional
cofactors or high energy molecules (e.g., ATP or GTP) are not required (Green et al., supra). Other
eukaryotic pseudouridine synthases have been identified that appear to be specific for rRNA
(reviewed in Smith, C.M. and J.A. Steitz (1997) Cell 89:669-672) and a dual-specificity enzyme has
been identified that uses both tRNA and rRNA substrates (Wrzesinski, J. et al. (1995) RNA
1:437-448). The absence of ψ in the anticodon loop of tRNAs results in reduced growth in both
bacteria (Singer, C.E. et al. (1972) Nature New Biol. 238:72-74) and yeast (Lecointe, F. (1998) J.
Biol. Chem. 273:1316-1323), although the genetic defect is not lethal.

Another ribonucleoside modification that occurs primarily in eukaryotic cells is the
conversion of guanosine to N2,N2-dimethylguanosine (m2,2G) at position 26 or 10 at the base of the
D-stem of cytosolic and mitochondrial tRNAs. This posttranscriptional modification is believed to
stabilize tRNA structure by preventing the formation of alternative tRNA secondary and tertiary
structures. Yeast tRNA^^^ is unusual in that it does not contain this modification. The modification
does not occur in eubacteria, presumably because the structure of tRNAs in these cells and organelles
is sequence constrained and does not require posttranscriptional modification to prevent the formation
of alternative structures (Steinberg, S. and R. Cedergren (1995) RNA 1:886-891, and references
within). The enzyme responsible for the conversion of guanosine to m2,2G is a 63 kDa
S-adenosylmethionine (SAM)-dependent tRNA N2,N2-dimethylguanosine methyltransferase (also
referred to as the TRM1 gene product and herein referred to as TRM) (Edqvist, J. (1995) Biochimie
77:54-61). The enzyme localizes to both the nucleus and the mitochondria (Li, J-M. et al. (1989) J.
Cell Biol. 109:1411-1419). Based on studies with TRM from Xenopus laevis, there appears to be a
requirement for base pairing at positions C11-G24 and G10-C25 immediately preceding the G26 to be
modified, with other structural features of the tRNA also being required for the proper presentation of
the G26 substrate (Edqvist, J. et al. (1992) Nucleic Acids Res. 20:6575-6581). Studies in yeast
suggest that cells carrying a weak ochre tRNA suppressor (sup3-i) are unable to suppress translation
termination in the absence of TRM activity, suggesting a role for TRM in modifying the frequency of
suppression in eukaryotic cells (Niederberger, C. et al. (1999) FEBS Lett. 464:67-70), in addition to
the more general function of ensuring the proper three-dimensional structures for tRNA.
Translation Initiation 

Initiation of translation can be divided into three stages. The first stage brings an initiator
transfer RNA (Met-tRNAf) together with the 40S ribosomal subunit to form the 43S preinitiation
complex. The second stage binds the 43S preinitiation complex to the mRNA, followed by migration
of the complex to the correct AUG initiation codon. The third stage brings the 60S ribosomal subunit
to the 40S subunit to generate an 80S ribosome at the initiation codon. Regulation of translation
primarily involves the first and second stages in the initiation process (Pain, V.M. (1996) Eur. J.
Biochem. 236:747-771).

Several initiation factors, many of which contain multiple subunits, are involved in bringing
an initiator tRNA and the 40S ribosomal subunit together. eIF2, a guanine nucleotide binding
protein, recruits the initiator tRNA to the 40S ribosomal subunit. Only when eIF2 is bound to GTP
does it associate with the initiator tRNA. eIF2B, a guanine nucleotide exchange protein, is
responsible for converting eIF2 from the GDP-bound inactive form to the GTP-bound active form.
Two other factors, eIF1A and eIF3, bind and stabilize the 40S subunit by interacting with the 18S
ribosomal RNA and specific ribosomal structural proteins. eIF3 is also involved in association of the
40S ribosomal subunit with mRNA. The Met-tRNAf, eIF1A, eIF3, and 40S ribosomal subunit
together make up the 43S preinitiation complex (Pain, supra).

Additional factors are required for binding of the 43S preinitiation complex to an mRNA
molecule, and the process is regulated at several levels. eIF4F is a complex consisting of three
proteins: eIF4E, eIF4A, and eIF4G. eIF4E recognizes and binds to the mRNA 5'-terminal m7GTP
cap, eIF4A is a bidirectional RNA-dependent helicase, and eIF4G is a scaffolding polypeptide.
eIF4G has three binding domains. The N-terminal third of eIF4G interacts with eIF4E, the central
third interacts with eIF4A, and the C-terminal third interacts with eIF3 bound to the 43S preinitiation
complex. Thus, eIF4G acts as a bridge between the 40S ribosomal subunit and the mRNA (Hentze,
M.W. (1997) Science 275:500-501).

The ability of eIF4F to initiate binding of the 43S preinitiation complex is regulated by
structural features of the mRNA. The mRNA molecule has an untranslated region (UTR) between
the 5' cap and the AUG start codon. In some mRNAs this region forms secondary structures that
impede binding of the 43S preinitiation complex. The helicase activity of eIF4A is thought to
function in removing this secondary structure to facilitate binding of the 43S preinitiation complex
(Pain, supra).

The translation of eukaryotic mRNA is a highly competitive and tightly regulated step in gene
expression. Control of this step is most commonly exerted at the rate-limiting initiation phase.
Ribosomal proteins involved in translation initiation have been known for some time and their
biochemical activities were used to build the currently accepted model for cap-dependent initiation of
translation (Merrick, W. C. et al. (1996) in Translational Control, Hershey, J. W. B. et al., Eds., Cold
Spring Harbor Laboratory Press, pp. 31-69). According to this model, the 5' cap structure
(m7GpppN) attracts the eukaryotic initiation factor 4F (eIF4F) complex to the mRNA. eIF4F is a
heteromultimeric complex composed of the cap-binding protein eIF4E, the RNA-dependent ATPase
eIF4A, and the modular factor eIF4G. The small (40S) ribosomal subunit binds to the 5' end of an
mRNA as a 43S complex which is thought to unwind secondary structure in the 5' UTR. The
resulting 48S complex then advances through the initiation cycle. A later movement of the 43S
complex along the mRNA, termed scanning, is the most plausible explanation for a faithful
recognition of the (usually) first AUG triplet as the start codon. Codon-anticodon base-pairing with
Met-tRNAi triggers eukaryotic initiation factor 2 (eIF2)-bound GTP hydrolysis, catalysed by
eukaryotic initiation factor 5 (eIF5). It has been thought that this causes dissociation of initiation
factors and the large (60S) subunit joining to form the 80S ribosome.

The bacterial translation initiation factor, IF2, is found to be evolutionarily conserved with
homologs identified in archaea, yeasts, mammals, zebrafish, and maize (Choi, S. D. et al. (1998)
Science 280:1757-1760; Lee, J. H. et al. (1999) Proc. Natl. Acad. Sci. U.S.A. 96:4342-4347). Mutant
strains of Saccharomyces cerevisiae which lack the gene which encodes yeast IF2 can be used to
demonstrate this evolutionary conservation with respect to IF2 activity. Protein biosynthetic activity
of translation extracts prepared from such mutant strains can be restored by addition of recombinant
yIF2 as described in Choi et al. (supra). Evidence that the biologic activity of these same translation
extracts can be restored by addition of either human or archaeal IF2 (Lee et al., supra) supports the
idea of universal conservation of IF2 function throughout evolution.

The eukaryotic translation initiation factor 4E (eIF4E) regulates the rate of translation
initiation. Overexpression of eIF4E results in rapid cell or tissue proliferation and malignant
transformation. eIF4E facilitates the synthesis of two powerful tumor angiogenic factors (VEGF and
FGF-2) by selectively enhancing their translation. eIF4E is overexpressed not only in all head and
neck squamous cell cancers but also in some dysplastic margins. Tumorigenesis in the head and neck
is proposed to be a multistep process preceded by clinically evident precancerous lesions (Nathan,
C-A. O. et al. (1999) Laryngoscope 109:1253-1258; De Benedetti, A. and A. L. Harris (1999) Int. J.
Biochem. Cell Biol. 31:59-72).

The human eukaryotic protein translation initiation factor, eIF2, binds GTP and Met-tRNAi,
then transfers Met-tRNAi to the 40S ribosomal subunit in a rate-limiting step in mRNA translation.
One member of this highly conserved, multigene family is the human eIF2C1 gene. This gene has
been mapped to chromosome 1p34-p35, which is a genomic area often lost in human cancers such as
Wilms tumors, neuroblastoma, and carcinomas of the breast, liver, and colon (Koesters, R. (1999)
Genomics 61:210-218).

Elongation factor 2 (eEF-2) is a 100-kDa protein that catalyzes the ribosomal translocation
reaction, resulting in the movement of ribosomes along mRNA. eEF-2 is the target for a very specific
Ca2+/calmodulin-dependent eEF-2 kinase. Phosphorylation of eEF-2 makes it inactive in translation,
which suggests that protein synthesis can be regulated by Ca2+ through eEF-2 phosphorylation. eEF-2
phosphorylation therefore regulates the cell cycle and other processes where changes of intracellular
Ca2+ concentration induce a new physiological state of a cell. The main role of eEF-2
phosphorylation in these processes is temporary inhibition of overall translation in response to
transient elevation of the Ca2+ concentrations in the cytoplasm. Temporary inhibition of translation
may trigger the transition of a cell from one physiologic state into another because of the
disappearance of short-lived repressors and thus the activation of expression of new genes (Ryazanov,
A. G. and A. S. Spirin (1990) New Biol. 2:843-850).

Other ribosomal proteins which modulate translation of mRNA include the retinoblastoma
protein (Rb1), HIV-1 TAR RNA binding protein (TARBP-b), v-fos transformation effector protein
(Fte-1), the colon carcinoma laminin-binding protein, the Wilms' tumor-related protein (QM), the
ribosomal phosphoproteins P0, P1, and P2, ubiquitin, and the Epstein-Barr virus small RNAs-associated
protein (EAP).

Translation Elongation

Elongation is the process whereby additional amino acids are joined to the initiator
methionine to form the complete polypeptide chain. The elongation factors EF1α, EF1βγ, and EF2
are involved in elongating the polypeptide chain following initiation. EF1α is a GTP-binding protein.
In its GTP-bound form, EF1α brings an aminoacyl-tRNA to the ribosome's A site. The amino acid
attached to the newly arrived aminoacyl-tRNA forms a peptide bond with the initiator methionine.
The GTP on EF1α is hydrolyzed to GDP, and EF1α-GDP dissociates from the ribosome. EF1βγ
binds EF1α-GDP and induces the dissociation of GDP from EF1α, allowing EF1α to bind GTP and
a new cycle to begin.

As subsequent aminoacyl-tRNAs are brought to the ribosome, EF-G, another GTP-binding
protein, catalyzes the translocation of tRNAs from the A site to the P site and finally to the E site of
the ribosome. This allows the ribosome and the mRNA to remain attached during translation.
Translation Termination 

The release factor eRF carries out termination of translation. eRF recognizes stop codons in 
the mRNA, leading to the release of the polypeptide chain from the ribosome.
Expression profiling

Microarrays are analytical tools used in bioanalysis. A microarray has a plurality of
molecules spatially distributed over, and stably associated with, the surface of a solid support.
Microarrays of polypeptides, polynucleotides, and/or antibodies have been developed and find use in
a variety of applications, such as gene sequencing, monitoring gene expression, gene mapping,
bacterial identification, drug discovery, and combinatorial chemistry.

One area in particular in which microarrays find use is in gene expression analysis. Array
technology can provide a simple way to explore the expression of a single polymorphic gene or the
expression profile of a large number of related or unrelated genes. When the expression of a single
gene is examined, arrays are employed to detect the expression of a specific gene or its variants.
When an expression profile is examined, arrays provide a platform for identifying genes that are
tissue specific, are affected by a substance being tested in a toxicology assay, are part of a signaling
cascade, carry out housekeeping functions, or are specifically related to a particular genetic
predisposition, condition, disease, or disorder.

There is a need in the art for new compositions, including nucleic acids and proteins, for the
diagnosis, prevention, and treatment of cell proliferative, neurological, developmental, and
autoimmune/inflammatory disorders, and infections.

SUMMARY OF THE INVENTION 

Various embodiments of the invention provide purified polypeptides, nucleic acid-associated
proteins, referred to collectively as "NAAP" and individually as "NAAP-1," "NAAP-2," "NAAP-3,"
"NAAP-4," "NAAP-5," "NAAP-6," "NAAP-7," "NAAP-8," "NAAP-9," "NAAP-10," "NAAP-11,"
"NAAP-12," "NAAP-13," "NAAP-14," "NAAP-15," "NAAP-16," "NAAP-17," "NAAP-18,"
"NAAP-19," "NAAP-20," "NAAP-21," "NAAP-22," "NAAP-23," "NAAP-24," "NAAP-25," "NAAP-26,"
"NAAP-27," "NAAP-28," "NAAP-29," "NAAP-30," "NAAP-31," "NAAP-32," "NAAP-33,"
"NAAP-34," and "NAAP-35," and methods for using these proteins and their encoding
polynucleotides for the detection, diagnosis, and treatment of diseases and medical conditions.
Embodiments also provide methods for utilizing the purified nucleic acid-associated proteins and/or
their encoding polynucleotides for facilitating the drug discovery process, including determination of
efficacy, dosage, toxicity, and pharmacology. Related embodiments provide methods for utilizing the
purified nucleic acid-associated proteins and/or their encoding polynucleotides for investigating the
pathogenesis of diseases and medical conditions.

An embodiment provides an isolated polypeptide selected from the group consisting of a) a
polypeptide comprising an amino acid sequence selected from the group consisting of SEQ ID
NO:1-35, b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% identical
or at least about 90% identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:1-35, c) a biologically active fragment of a polypeptide having an amino acid sequence selected
from the group consisting of SEQ ID NO:1-35, and d) an immunogenic fragment of a polypeptide
having an amino acid sequence selected from the group consisting of SEQ ID NO:1-35. Another
embodiment provides an isolated polypeptide comprising an amino acid sequence of SEQ ID
NO:1-35.
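
By way of illustration only, the "at least 90% identical" criterion recited above can be sketched in
Python as follows. The sketch assumes that the two amino acid sequences have already been aligned
(with gaps written as "-") and that percent identity is taken as the number of identical residues
divided by the number of aligned columns. The function names, the example sequences, and this
choice of denominator are assumptions made for the example and are not prescribed by the
embodiments.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over a precomputed pairwise alignment (gaps are '-')."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns in which both sequences carry a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        return 0.0
    # A column counts as a match only if both residues are identical non-gaps.
    matches = sum(1 for x, y in columns if x == y and x != "-")
    return 100.0 * matches / len(columns)


def meets_identity_threshold(aligned_a: str, aligned_b: str,
                             threshold: float = 90.0) -> bool:
    """True when the aligned pair is at least `threshold` percent identical."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Hypothetical ten-residue sequences differing at one position.
    print(percent_identity("MKTAYIAKQR", "MKTAYIAKQK"))
    print(meets_identity_threshold("MKTAYIAKQR", "MKTAYIAKQK"))
```

Other conventions (for example, dividing by the length of the shorter sequence) give different
values for gapped alignments, so the convention used should be stated alongside any reported
percentage.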

Still another embodiment provides an isolated polynucleotide encoding a polypeptide
selected from the group consisting of a) a polypeptide comprising an amino acid sequence selected
from the group consisting of SEQ ID NO:1-35, b) a polypeptide comprising a naturally occurring
amino acid sequence at least 90% identical or at least about 90% identical to an amino acid sequence
selected from the group consisting of SEQ ID NO:1-35, c) a biologically active fragment of a
polypeptide having an amino acid sequence selected from the group consisting of SEQ ID NO:1-35,
and d) an immunogenic fragment of a polypeptide having an amino acid sequence selected from the
group consisting of SEQ ID NO:1-35. In another embodiment, the polynucleotide encodes a
polypeptide selected from the group consisting of SEQ ID NO:1-35. In an alternative embodiment,
the polynucleotide is selected from the group consisting of SEQ ID NO:36-70.

Still another embodiment provides a recombinant polynucleotide comprising a promoter
sequence operably linked to a polynucleotide encoding a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-35, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical or at least about 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-35, c) a biologically active fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-35, and d) an immunogenic
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-35. Another embodiment provides a cell transformed with the recombinant polynucleotide.
Yet another embodiment provides a transgenic organism comprising the recombinant polynucleotide.

Another embodiment provides a method for producing a polypeptide selected from the group
consisting of a) a polypeptide comprising an amino acid sequence selected from the group consisting
of SEQ ID NO:1-35, b) a polypeptide comprising a naturally occurring amino acid sequence at least
90% identical or at least about 90% identical to an amino acid sequence selected from the group
consisting of SEQ ID NO:1-35, c) a biologically active fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-35, and d) an immunogenic
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-35. The method comprises a) culturing a cell under conditions suitable for expression of the
polypeptide, wherein said cell is transformed with a recombinant polynucleotide comprising a
promoter sequence operably linked to a polynucleotide encoding the polypeptide, and b) recovering
the polypeptide so expressed.

Yet another embodiment provides an isolated antibody which specifically binds to a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-35, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-35, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-35, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-35.

Still yet another embodiment provides an isolated polynucleotide selected from the group
consisting of a) a polynucleotide comprising a polynucleotide sequence selected from the group
consisting of SEQ ID NO:36-70, b) a polynucleotide comprising a naturally occurring polynucleotide
sequence at least 90% identical or at least about 90% identical to a polynucleotide sequence selected
from the group consisting of SEQ ID NO:36-70, c) a polynucleotide complementary to the
polynucleotide of a), d) a polynucleotide complementary to the polynucleotide of b), and e) an RNA
equivalent of a)-d). In other embodiments, the polynucleotide can comprise at least about 20, 30, 40,
60, 80, or 100 contiguous nucleotides.
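
The relationships among a polynucleotide, its complement, its RNA equivalent, and its contiguous
fragments can be illustrated with a minimal Python sketch. The sketch assumes unambiguous DNA
written 5' to 3' with the letters A, C, G, and T; the function names and the example sequences are
hypothetical and chosen only for illustration.

```python
# Translation table for Watson-Crick complements of unambiguous DNA bases.
_DNA_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(dna: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return dna.translate(_DNA_COMPLEMENT)[::-1]


def rna_equivalent(dna: str) -> str:
    """Return the RNA equivalent of a DNA sequence (T replaced by U)."""
    return dna.replace("T", "U").replace("t", "u")


def contiguous_fragments(sequence: str, length: int) -> list[str]:
    """Return every contiguous fragment of the given length, in order."""
    if length <= 0:
        raise ValueError("length must be positive")
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]


if __name__ == "__main__":
    example = "ATGGCCATTGTAATGGGCCGCTGAAAGGGTGCCCGATAG"  # hypothetical sequence
    print(reverse_complement(example))
    print(rna_equivalent(example))
    # Number of distinct start positions for a 20-nucleotide fragment.
    print(len(contiguous_fragments(example, 20)))
```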

Yet another embodiment provides a method for detecting a target polynucleotide in a sample,
said target polynucleotide being selected from the group consisting of a) a polynucleotide comprising
a polynucleotide sequence selected from the group consisting of SEQ ID NO:36-70, b) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:36-70, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method
comprises a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides
comprising a sequence complementary to said target polynucleotide in the sample, and which probe
specifically hybridizes to said target polynucleotide, under conditions whereby a hybridization
complex is formed between said probe and said target polynucleotide or fragments thereof, and b)
detecting the presence or absence of said hybridization complex. In a related embodiment, the
method can include detecting the amount of the hybridization complex. In still other embodiments,
the probe can comprise at least about 20, 30, 40, 60, 80, or 100 contiguous nucleotides.

Still yet another embodiment provides a method for detecting a target polynucleotide in a
sample, said target polynucleotide being selected from the group consisting of a) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:36-70, b) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:36-70, c) a polynucleotide complementary to the polynucleotide of a), d) a polynucleotide
complementary to the polynucleotide of b), and e) an RNA equivalent of a)-d). The method
comprises a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and b) detecting the presence or absence of said amplified target
polynucleotide or fragment thereof. In a related embodiment, the method can include detecting the
amount of the amplified target polynucleotide or fragment thereof.

Another embodiment provides a composition comprising an effective amount of a
polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-35, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-35, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-35, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-35, and a pharmaceutically acceptable excipient.
In one embodiment, the composition can comprise an amino acid sequence selected from the group
consisting of SEQ ID NO:1-35. Other embodiments provide a method of treating a disease or
condition associated with decreased or abnormal expression of functional NAAP, comprising
administering to a patient in need of such treatment the composition.

Yet another embodiment provides a method for screening a compound for effectiveness as an
agonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-35, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-35, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-35, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-35. The method comprises a) exposing a sample
comprising the polypeptide to a compound, and b) detecting agonist activity in the sample. Another
embodiment provides a composition comprising an agonist compound identified by the method and a
pharmaceutically acceptable excipient. Yet another embodiment provides a method of treating a
disease or condition associated with decreased expression of functional NAAP, comprising
administering to a patient in need of such treatment the composition.

Still yet another embodiment provides a method for screening a compound for effectiveness
as an antagonist of a polypeptide selected from the group consisting of a) a polypeptide comprising an
amino acid sequence selected from the group consisting of SEQ ID NO:1-35, b) a polypeptide
comprising a naturally occurring amino acid sequence at least 90% identical or at least about 90%
identical to an amino acid sequence selected from the group consisting of SEQ ID NO:1-35, c) a
biologically active fragment of a polypeptide having an amino acid sequence selected from the group
consisting of SEQ ID NO:1-35, and d) an immunogenic fragment of a polypeptide having an amino
acid sequence selected from the group consisting of SEQ ID NO:1-35. The method comprises a)
exposing a sample comprising the polypeptide to a compound, and b) detecting antagonist activity in
the sample. Another embodiment provides a composition comprising an antagonist compound
identified by the method and a pharmaceutically acceptable excipient. Yet another embodiment
provides a method of treating a disease or condition associated with overexpression of functional
NAAP, comprising administering to a patient in need of such treatment the composition.

Another embodiment provides a method of screening for a compound that specifically binds
to a polypeptide selected from the group consisting of a) a polypeptide comprising an amino acid
sequence selected from the group consisting of SEQ ID NO:1-35, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-35, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-35, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-35. The method comprises a) combining the
polypeptide with at least one test compound under suitable conditions, and b) detecting binding of the
polypeptide to the test compound, thereby identifying a compound that specifically binds to the
polypeptide.

Yet another embodiment provides a method of screening for a compound that modulates the
activity of a polypeptide selected from the group consisting of a) a polypeptide comprising an amino
acid sequence selected from the group consisting of SEQ ID NO:1-35, b) a polypeptide comprising a
naturally occurring amino acid sequence at least 90% identical or at least about 90% identical to an
amino acid sequence selected from the group consisting of SEQ ID NO:1-35, c) a biologically active
fragment of a polypeptide having an amino acid sequence selected from the group consisting of SEQ
ID NO:1-35, and d) an immunogenic fragment of a polypeptide having an amino acid sequence
selected from the group consisting of SEQ ID NO:1-35. The method comprises a) combining the
polypeptide with at least one test compound under conditions permissive for the activity of the
polypeptide, b) assessing the activity of the polypeptide in the presence of the test compound, and c)
comparing the activity of the polypeptide in the presence of the test compound with the activity of the
polypeptide in the absence of the test compound, wherein a change in the activity of the polypeptide
in the presence of the test compound is indicative of a compound that modulates the activity of the
polypeptide.
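
The comparison in step c) above can be expressed, purely as a sketch, in Python. The sketch assumes
that activity has been measured in arbitrary but identical units with and without the test compound,
and it treats a relative change larger than a tolerance as a change in activity. The 10% default
tolerance, the function name, and the returned labels are assumptions introduced for this example;
the embodiments do not specify a threshold.

```python
def classify_modulation(activity_with: float, activity_without: float,
                        tolerance: float = 0.10) -> str:
    """Compare activity with and without a test compound.

    Returns "increased", "decreased", or "unchanged" according to the
    relative change, using `tolerance` as an assumed noise margin.
    """
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    relative_change = (activity_with - activity_without) / activity_without
    if relative_change > tolerance:
        return "increased"
    if relative_change < -tolerance:
        return "decreased"
    return "unchanged"


if __name__ == "__main__":
    # Hypothetical readings: baseline of 100 units without the compound.
    for reading in (150.0, 50.0, 105.0):
        print(reading, classify_modulation(reading, 100.0))
```

In practice the tolerance would be chosen from the variability of replicate measurements rather
than fixed in advance.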

Still yet another embodiment provides a method for screening a compound for effectiveness
in altering expression of a target polynucleotide, wherein said target polynucleotide comprises a
polynucleotide sequence selected from the group consisting of SEQ ID NO:36-70, the method
comprising a) exposing a sample comprising the target polynucleotide to a compound, b) detecting
altered expression of the target polynucleotide, and c) comparing the expression of the target
polynucleotide in the presence of varying amounts of the compound and in the absence of the
compound.

Another embodiment provides a method for assessing toxicity of a test compound, said
method comprising a) treating a biological sample containing nucleic acids with the test compound;
b) hybridizing the nucleic acids of the treated biological sample with a probe comprising at least 20
contiguous nucleotides of a polynucleotide selected from the group consisting of i) a polynucleotide
comprising a polynucleotide sequence selected from the group consisting of SEQ ID NO:36-70, ii) a
polynucleotide comprising a naturally occurring polynucleotide sequence at least 90% identical or at
least about 90% identical to a polynucleotide sequence selected from the group consisting of SEQ ID
NO:36-70, iii) a polynucleotide having a sequence complementary to i), iv) a polynucleotide
complementary to the polynucleotide of ii), and v) an RNA equivalent of i)-iv). Hybridization occurs
under conditions whereby a specific hybridization complex is formed between said probe and a target
polynucleotide in the biological sample, said target polynucleotide selected from the group consisting
of i) a polynucleotide comprising a polynucleotide sequence selected from the group consisting of
SEQ ID NO:36-70, ii) a polynucleotide comprising a naturally occurring polynucleotide sequence at
least 90% identical or at least about 90% identical to a polynucleotide sequence selected from the
group consisting of SEQ ID NO:36-70, iii) a polynucleotide complementary to the polynucleotide of
i), iv) a polynucleotide complementary to the polynucleotide of ii), and v) an RNA equivalent of
i)-iv). Alternatively, the target polynucleotide can comprise a fragment of a polynucleotide selected
from the group consisting of i)-v) above; c) quantifying the amount of hybridization complex; and d)
comparing the amount of hybridization complex in the treated biological sample with the amount of
hybridization complex in an untreated biological sample, wherein a difference in the amount of
hybridization complex in the treated biological sample is indicative of toxicity of the test compound.
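
Steps c) and d) of the toxicity assessment can be illustrated with a short Python sketch. It assumes
that the amount of hybridization complex is available as a positive signal value per probe for both
the treated and the untreated sample, and it reports probes whose signal differs by at least a chosen
log2 ratio. The two-fold default cutoff, the probe identifiers, and the function names are assumptions
made for illustration; the method itself requires only that a difference be detected.

```python
import math


def log2_ratio(treated: float, untreated: float) -> float:
    """log2 of the treated/untreated hybridization signal."""
    if treated <= 0 or untreated <= 0:
        raise ValueError("signals must be positive")
    return math.log2(treated / untreated)


def flag_changed_probes(treated: dict[str, float],
                        untreated: dict[str, float],
                        min_abs_log2: float = 1.0) -> dict[str, float]:
    """Return probes measured in both samples whose signal changed.

    A probe is reported when the absolute log2 ratio is at least
    `min_abs_log2` (1.0 corresponds to an assumed two-fold cutoff).
    """
    changed = {}
    for probe_id in treated.keys() & untreated.keys():
        ratio = log2_ratio(treated[probe_id], untreated[probe_id])
        if abs(ratio) >= min_abs_log2:
            changed[probe_id] = ratio
    return changed


if __name__ == "__main__":
    # Hypothetical signal values for two probes.
    treated_signal = {"probe_1": 400.0, "probe_2": 110.0}
    untreated_signal = {"probe_1": 100.0, "probe_2": 100.0}
    print(flag_changed_probes(treated_signal, untreated_signal))
```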

BRIEF DESCRIPTION OF THE TABLES

Table 1 summarizes the nomenclature for full length polynucleotide and polypeptide
embodiments of the invention.

Table 2 shows the GenBank identification number and annotation of the nearest GenBank
homolog for polypeptide embodiments of the invention. The probability scores for the matches
between each polypeptide and its homolog(s) are also shown.

Table 3 shows structural features of polypeptide embodiments, including predicted motifs 
and domains, along with the methods, algorithms, and searchable databases used for analysis of the 
polypeptides. 

Table 4 lists the cDNA and/or genomic DNA fragments which were used to assemble
polynucleotide embodiments, along with selected fragments of the polynucleotides.

Table 5 shows representative cDNA libraries for polynucleotide embodiments. 

Table 6 provides an appendix which describes the tissues and vectors used for construction of 
the cDNA libraries shown in Table 5. 

Table 7 shows the tools, programs, and algorithms used to analyze polynucleotides and
polypeptides, along with applicable descriptions, references, and threshold parameters.

DESCRIPTION OF THE INVENTION 

Before the present proteins, nucleic acids, and methods are described, it is understood that
embodiments of the invention are not limited to the particular machines, instruments, materials, and
methods described, as these may vary. It is also to be understood that the terminology used herein is
for the purpose of describing particular embodiments only, and is not intended to limit the scope of
the invention.

As used herein and in the appended claims, the singular forms "a," "an," and "the" include
plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to "a
host cell" includes a plurality of such host cells, and a reference to "an antibody" is a reference to one
or more antibodies and equivalents thereof known to those skilled in the art, and so forth.

Unless defined otherwise, all technical and scientific terms used herein have the same
meanings as commonly understood by one of ordinary skill in the art to which this invention belongs.
Although any machines, materials, and methods similar or equivalent to those described herein can be
used to practice or test the present invention, the preferred machines, materials and methods are now
described. All publications mentioned herein are cited for the purpose of describing and disclosing
the cell lines, protocols, reagents and vectors which are reported in the publications and which might
be used in connection with various embodiments of the invention. Nothing herein is to be construed
as an admission that the invention is not entitled to antedate such disclosure by virtue of prior
invention.

DEFINITIONS 

"NAAP" refers to the amino acid sequences of substantially purified NAAP obtained from 
any species, particularly a mammalian species, including bovine, ovine, porcine, murine, equine, and 
human, and from any source, whether natural, synthetic, semi-synthetic, or recombinant. 

The term "agonist" refers to a molecule which intensifies or mimics the biological activity of
NAAP. Agonists may include proteins, nucleic acids, carbohydrates, small molecules, or any other
compound or composition which modulates the activity of NAAP either by directly interacting with
NAAP or by acting on components of the biological pathway in which NAAP participates.

An "allelic variant" is an alternative form of the gene encoding NAAP. Allelic variants may
result from at least one mutation in the nucleic acid sequence and may result in altered mRNAs or in
polypeptides whose structure or function may or may not be altered. A gene may have none, one, or
many allelic variants of its naturally occurring form. Common mutational changes which give rise to
allelic variants are generally ascribed to natural deletions, additions, or substitutions of nucleotides.
Each of these types of changes may occur alone, or in combination with the others, one or more times
in a given sequence.

"Altered" nucleic acid sequences encoding NAAP include those sequences with deletions, 
insertions, or substitutions of different nucleotides, resulting in a polypeptide the same as NAAP or a
polypeptide with at least one functional characteristic of NAAP. Included within this definition are
polymorphisms which may or may not be readily detectable using a particular oligonucleotide probe
of the polynucleotide encoding NAAP, and improper or unexpected hybridization to allelic variants,
with a locus other than the normal chromosomal locus for the polynucleotide encoding NAAP. The
encoded protein may also be "altered," and may contain deletions, insertions, or substitutions of
amino acid residues which produce a silent change and result in a functionally equivalent NAAP.
Deliberate amino acid substitutions may be made on the basis of one or more similarities in polarity,
charge, solubility, hydrophobicity, hydrophilicity, and/or the amphipathic nature of the residues, as
long as the biological or immunological activity of NAAP is retained. For example, negatively
charged amino acids may include aspartic acid and glutamic acid, and positively charged amino acids
may include lysine and arginine. Amino acids with uncharged polar side chains having similar
hydrophilicity values may include: asparagine and glutamine; and serine and threonine. Amino acids
with uncharged side chains having similar hydrophilicity values may include: leucine, isoleucine, and
valine; glycine and alanine; and phenylalanine and tyrosine.

The terms "amino acid" and "amino acid sequence" can refer to an oligopeptide, a peptide, a 
polypeptide, or a protein sequence, or a fragment of any of these, and to naturally occurring or 
synthetic molecules. Where "amino acid sequence" is recited to refer to a sequence of a naturally 
occurring protein molecule, "amino acid sequence" and like terms are not meant to limit the amino
acid sequence to the complete native amino acid sequence associated with the recited protein
molecule.

"Amplification" relates to the production of additional copies of a nucleic acid. 
Amplification may be carried out using polymerase chain reaction (PCR) technologies or other
nucleic acid amplification technologies well known in the art.

The term "antagonist" refers to a molecule which inhibits or attenuates the biological activity 
of NAAP. Antagonists may include proteins such as antibodies, anticalins, nucleic acids, 
carbohydrates, small molecules, or any other compound or composition which modulates the activity 
of NAAP either by directly interacting with NAAP or by acting on components of the biological
pathway in which NAAP participates.

The term "antibody" refers to intact immunoglobulin molecules as well as to fragments 
thereof, such as Fab, F(ab')2, and Fv fragments, which are capable of binding an epitopic determinant.
Antibodies that bind NAAP polypeptides can be prepared using intact polypeptides or using
fragments containing small peptides of interest as the immunizing antigen. The polypeptide or
oligopeptide used to immunize an animal (e.g., a mouse, a rat, or a rabbit) can be derived from the
translation of RNA, or synthesized chemically, and can be conjugated to a carrier protein if desired.
Commonly used carriers that are chemically coupled to peptides include bovine serum albumin,
thyroglobulin, and keyhole limpet hemocyanin (KLH). The coupled peptide is then used to immunize
the animal.

The term "antigenic determinant" refers to that region of a molecule (i.e., an epitope) that
makes contact with a particular antibody. When a protein or a fragment of a protein is used to
immunize a host animal, numerous regions of the protein may induce the production of antibodies
which bind specifically to antigenic determinants (particular regions or three-dimensional structures
on the protein). An antigenic determinant may compete with the intact antigen (i.e., the immunogen
used to elicit the immune response) for binding to an antibody.

The term "aptamer" refers to a nucleic acid or oligonucleotide molecule that binds to a 
specific molecular target. Aptamers are derived from an in vitro evolutionary process (e.g., SELEX 
(Systematic Evolution of Ligands by Exponential Enrichment), described in U.S. Patent No. 
5,270,163), which selects for target-specific aptamer sequences from large combinatorial libraries.
Aptamer compositions may be double-stranded or single-stranded, and may include
deoxyribonucleotides, ribonucleotides, nucleotide derivatives, or other nucleotide-like molecules.
The nucleotide components of an aptamer may have modified sugar groups (e.g., the 2'-OH group of a
ribonucleotide may be replaced by 2'-F or 2'-NH2), which may improve a desired property, e.g.,
resistance to nucleases or longer lifetime in blood. Aptamers may be conjugated to other molecules,
e.g., a high molecular weight carrier to slow clearance of the aptamer from the circulatory system.
Aptamers may be specifically cross-linked to their cognate ligands, e.g., by photo-activation of a
cross-linker (Brody, E.N. and L. Gold (2000) J. Biotechnol. 74:5-13).

The term "intramer" refers to an aptamer which is expressed in vivo. For example, a vaccinia
virus-based RNA expression system has been used to express specific RNA aptamers at high levels in
the cytoplasm of leukocytes (Blind, M. et al. (1999) Proc. Natl. Acad. Sci. USA 96:3606-3610).

The term "spiegelmer" refers to an aptamer which includes L-DNA, L-RNA, or other left-handed
nucleotide derivatives or nucleotide-like molecules. Aptamers containing left-handed
nucleotides are resistant to degradation by naturally occurring enzymes, which normally act on
substrates containing right-handed nucleotides.

The term "antisense" refers to any composition capable of base-pairing with the "sense"
(coding) strand of a polynucleotide having a specific nucleic acid sequence. Antisense compositions
may include DNA; RNA; peptide nucleic acid (PNA); oligonucleotides having modified backbone
linkages such as phosphorothioates, methylphosphonates, or benzylphosphonates; oligonucleotides
having modified sugar groups such as 2'-methoxyethyl sugars or 2'-methoxyethoxy sugars; or
oligonucleotides having modified bases such as 5-methyl cytosine, 2'-deoxyuracil, or
7-deaza-2'-deoxyguanosine. Antisense molecules may be produced by any method including chemical synthesis
or transcription. Once introduced into a cell, the complementary antisense molecule base-pairs with a
naturally occurring nucleic acid sequence produced by the cell to form duplexes which block either
transcription or translation. The designation "negative" or "minus" can refer to the antisense strand,
and the designation "positive" or "plus" can refer to the sense strand of a reference DNA molecule.

The term "biologically active" refers to a protein having structural, regulatory, or biochemical 
functions of a naturally occurring molecule. Likewise, "immunologically active" or "immunogenic" 
refers to the capability of the natural, recombinant, or synthetic NAAP, or of any oligopeptide 
thereof, to induce a specific immune response in appropriate animals or cells and to bind with specific
antibodies.

"Complementary" describes the relationship between two single-stranded nucleic acid 
sequences that anneal by base-pairing. For example, 5'-AGT-3' pairs with its complement,
3'-TCA-5'.
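The pairing rule in this definition can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of the specification; only unmodified DNA bases (A, C, G, T) are assumed.

```python
# Illustrative sketch only; helper names are hypothetical, not from the specification.
# Assumes plain DNA bases A, C, G, T with Watson-Crick pairing.
_PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(seq: str) -> str:
    """Base-by-base complement, written in the same left-to-right order.

    For a strand written 5'->3', the result reads 3'->5'.
    """
    return "".join(_PAIRS[base] for base in seq.upper())


def reverse_complement(seq: str) -> str:
    """Complement re-written in the conventional 5'->3' orientation."""
    return complement(seq)[::-1]


def are_complementary(strand_a: str, strand_b: str) -> bool:
    """True if two strands, both written 5'->3', pair along their full length."""
    return reverse_complement(strand_a) == strand_b.upper()


# The example from the text: 5'-AGT-3' pairs with 3'-TCA-5'.
print(complement("AGT"))          # read 3'->5'
print(reverse_complement("AGT"))  # same strand read 5'->3'
```

Written 3' to 5', the complement of AGT is TCA, as in the text; written in the usual 5' to 3' direction the same strand reads ACT.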

A "composition comprising a given polynucleotide" and a "composition comprising a given 
polypeptide" can refer to any composition containing the given polynucleotide or polypeptide. The
composition may comprise a dry formulation or an aqueous solution. Compositions comprising
polynucleotides encoding NAAP or fragments of NAAP may be employed as hybridization probes.
The probes may be stored in freeze-dried form and may be associated with a stabilizing agent such as
a carbohydrate. In hybridizations, the probe may be deployed in an aqueous solution containing salts
(e.g., NaCl), detergents (e.g., sodium dodecyl sulfate; SDS), and other components (e.g., Denhardt's
solution, dry milk, salmon sperm DNA, etc.).

"Consensus sequence" refers to a nucleic acid sequence which has been subjected to repeated 
DNA sequence analysis to resolve uncalled bases, extended using the XL-PCR kit (Applied 
Biosystems, Foster City CA) in the 5' and/or the 3' direction, and resequenced, or which has been
assembled from one or more overlapping cDNA, EST, or genomic DNA fragments using a computer
program for fragment assembly, such as the GELVIEW fragment assembly system (GCG, Madison
WI) or Phrap (University of Washington, Seattle WA). Some sequences have been both extended and
assembled to produce the consensus sequence.

"Conservative amino acid substitutions" are those substitutions that are predicted to least
interfere with the properties of the original protein, i.e., the structure and especially the function of
the protein is conserved and not significantly changed by such substitutions. The table below shows
amino acids which may be substituted for an original amino acid in a protein and which are regarded
as conservative amino acid substitutions.





Original Residue        Conservative Substitution
Ala                     Gly, Ser
Arg                     His, Lys
Asn                     Asp, Gln, His
Asp                     Asn, Glu
Cys                     Ala, Ser
Gln                     Asn, Glu, His
Glu                     Asp, Gln, His
Gly                     Ala
His                     Asn, Arg, Gln, Glu
Ile                     Leu, Val
Leu                     Ile, Val
Lys                     Arg, Gln, Glu
Met                     Leu, Ile
Phe                     His, Met, Leu, Trp, Tyr
Ser                     Cys, Thr
Thr                     Ser, Val
Trp                     Phe, Tyr
Tyr                     His, Phe, Trp
Val                     Ile, Leu, Thr



Conservative amino acid substitutions generally maintain (a) the structure of the polypeptide
backbone in the area of the substitution, for example, as a beta sheet or alpha helical conformation,
(b) the charge or hydrophobicity of the molecule at the site of the substitution, and/or (c) the bulk of
the side chain.
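By way of illustration only, the substitution table above can be held as a lookup structure. The following Python sketch transcribes the table as given; the names CONSERVATIVE and is_conservative are hypothetical and are not part of the specification.

```python
# Illustrative sketch; identifiers are hypothetical. The mapping transcribes the
# table of conservative substitutions above (three-letter residue codes).
CONSERVATIVE = {
    "Ala": {"Gly", "Ser"},
    "Arg": {"His", "Lys"},
    "Asn": {"Asp", "Gln", "His"},
    "Asp": {"Asn", "Glu"},
    "Cys": {"Ala", "Ser"},
    "Gln": {"Asn", "Glu", "His"},
    "Glu": {"Asp", "Gln", "His"},
    "Gly": {"Ala"},
    "His": {"Asn", "Arg", "Gln", "Glu"},
    "Ile": {"Leu", "Val"},
    "Leu": {"Ile", "Val"},
    "Lys": {"Arg", "Gln", "Glu"},
    "Met": {"Leu", "Ile"},
    "Phe": {"His", "Met", "Leu", "Trp", "Tyr"},
    "Ser": {"Cys", "Thr"},
    "Thr": {"Ser", "Val"},
    "Trp": {"Phe", "Tyr"},
    "Tyr": {"His", "Phe", "Trp"},
    "Val": {"Ile", "Leu", "Thr"},
}


def is_conservative(original: str, replacement: str) -> bool:
    """True if the table lists `replacement` as conservative for `original`.

    Residues absent from the table are treated as having no listed
    conservative substitution.
    """
    return replacement in CONSERVATIVE.get(original, set())


print(is_conservative("Ala", "Ser"))
print(is_conservative("Ser", "Ala"))
```

As transcribed, the table is directional: Ser is listed as a substitute for Ala, but Ala is not listed as a substitute for Ser, so a lookup should be made from the original residue to the replacement and not the reverse.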

A "deletion" refers to a change in the amino acid or nucleotide sequence that results in the 
absence of one or more amino acid residues or nucleotides. 

The term "derivative" refers to a chemically modified polynucleotide or polypeptide. 
Chemical modifications of a polynucleotide can include, for example, replacement of hydrogen by an
alkyl, acyl, hydroxyl, or amino group. A derivative polynucleotide encodes a polypeptide which
retains at least one biological or immunological function of the natural molecule. A derivative
polypeptide is one modified by glycosylation, pegylation, or any similar process that retains at least
one biological or immunological function of the polypeptide from which it was derived.

A "detectable label" refers to a reporter molecule or enzyme that is capable of generating a
measurable signal and is covalently or noncovalently joined to a polynucleotide or polypeptide.

"Differential expression" refers to increased or upregulated; or decreased, downregulated, or 
absent gene or protein expression, determined by comparing at least two different samples. Such
comparisons may be carried out between, for example, a treated and an untreated sample, or a
diseased and a normal sample.

"Exon shuffling" refers to the recombination of different coding regions (exons). Since an 
exon may represent a structural or functional domain of the encoded protein, new proteins may be 
assembled through the novel reassortment of stable substructures, thus allowing acceleration of the 
evolution of new protein functions. 

A "fragment" is a unique portion of NAAP or a polynucleotide encoding NAAP which can be
identical in sequence to, but shorter in length than, the parent sequence. A fragment may comprise up
to the entire length of the defined sequence, minus one nucleotide/amino acid residue. For example, a
fragment may comprise from about 5 to about 1000 contiguous nucleotides or amino acid residues. A
fragment used as a probe, primer, antigen, therapeutic molecule, or for other purposes, may be at least
5, 10, 15, 16, 20, 25, 30, 40, 50, 60, 75, 100, 150, 250 or at least 500 contiguous nucleotides or amino
acid residues in length. Fragments may be preferentially selected from certain regions of a molecule.
For example, a polypeptide fragment may comprise a certain length of contiguous amino acids
selected from the first 250 or 500 amino acids (or first 25% or 50%) of a polypeptide as shown in a
certain defined sequence. Clearly these lengths are exemplary, and any length that is supported by
the specification, including the Sequence Listing, tables, and figures, may be encompassed by the
present embodiments.
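The length constraints in this definition can be expressed as a short Python sketch. The helper names and the default minimum length of 5 (taken from the exemplary range above) are assumptions made for illustration; uniqueness of the fragment within a genome is not checked here.

```python
# Illustrative sketch; names and the default minimum length are assumptions.
from typing import Iterator


def is_fragment(candidate: str, parent: str, min_len: int = 5) -> bool:
    """True if `candidate` is a contiguous stretch of `parent` that is at
    least `min_len` residues long and shorter than the parent sequence."""
    return min_len <= len(candidate) < len(parent) and candidate in parent


def fragments(parent: str, length: int) -> Iterator[str]:
    """Yield every contiguous fragment of exactly `length` residues."""
    if not 0 < length < len(parent):
        raise ValueError("length must be between 1 and len(parent) - 1")
    for start in range(len(parent) - length + 1):
        yield parent[start:start + length]


parent = "GATTACA"
print(is_fragment("GATTAC", parent))  # entire length minus one residue
print(list(fragments(parent, 6)))
```

The longest admissible fragment of a seven-residue parent is therefore six residues, which corresponds to "the entire length of the defined sequence, minus one" in the text.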

A fragment of SEQ ID NO:36-70 can comprise a region of unique polynucleotide sequence
that specifically identifies SEQ ID NO:36-70, for example, as distinct from any other sequence in the
genome from which the fragment was obtained. A fragment of SEQ ID NO:36-70 can be employed
in one or more embodiments of methods of the invention, for example, in hybridization and
amplification technologies and in analogous methods that distinguish SEQ ID NO:36-70 from related
polynucleotides. The precise length of a fragment of SEQ ID NO:36-70 and the region of SEQ ID
NO:36-70 to which the fragment corresponds are routinely determinable by one of ordinary skill in
the art based on the intended purpose for the fragment.

A fragment of SEQ ID NO:1-35 is encoded by a fragment of SEQ ID NO:36-70. A fragment
of SEQ ID NO:1-35 can comprise a region of unique amino acid sequence that specifically identifies
SEQ ID NO:1-35. For example, a fragment of SEQ ID NO:1-35 can be used as an immunogenic
peptide for the development of antibodies that specifically recognize SEQ ID NO:1-35. The precise
length of a fragment of SEQ ID NO:1-35 and the region of SEQ ID NO:1-35 to which the fragment
corresponds can be determined based on the intended purpose for the fragment using one or more
analytical methods described herein or otherwise known in the art.

A "full length" polynucleotide is one containing at least a translation initiation codon (e.g., 
methionine) followed by an open reading frame and a translation termination codon. A "full length"
polynucleotide sequence encodes a "full length" polypeptide sequence.

"Homology" refers to sequence similarity or, interchangeably, sequence identity, between
two or more polynucleotide sequences or two or more polypeptide sequences.

The terms "percent identity" and "% identity," as applied to polynucleotide sequences, refer
to the percentage of residue matches between at least two polynucleotide sequences aligned using a
standardized algorithm. Such an algorithm may insert, in a standardized and reproducible way, gaps
in the sequences being compared in order to optimize alignment between two sequences, and
therefore achieve a more meaningful comparison of the two sequences.

Percent identity between polynucleotide sequences may be determined using one or more
computer algorithms or programs known in the art or described herein. For example, percent identity
can be determined using the default parameters of the CLUSTAL V algorithm as incorporated into
the MEGALIGN version 3.12e sequence alignment program. This program is part of the
LASERGENE software package, a suite of molecular biological analysis programs (DNASTAR,
Madison WI). CLUSTAL V is described in Higgins, D.G. and P.M. Sharp (1989; CABIOS 5:151-153)
and in Higgins, D.G. et al. (1992; CABIOS 8:189-191). For pairwise alignments of
polynucleotide sequences, the default parameters are set as follows: Ktuple=2, gap penalty=5,
window=4, and "diagonals saved"=4. The "weighted" residue weight table is selected as the default.
Percent identity is reported by CLUSTAL V as the "percent similarity" between aligned
polynucleotide sequences.

Alternatively, a suite of commonly used and freely available sequence comparison algorithms
which can be used is provided by the National Center for Biotechnology Information (NCBI) Basic
Local Alignment Search Tool (BLAST) (Altschul, S.F. et al. (1990) J. Mol. Biol. 215:403-410),
which is available from several sources, including the NCBI, Bethesda, MD, and on the Internet at
http://www.ncbi.nlm.nih.gov/BLAST/. The BLAST software suite includes various sequence
analysis programs including "blastn," that is used to align a known polynucleotide sequence with
other polynucleotide sequences from a variety of databases. Also available is a tool called "BLAST 2
Sequences" that is used for direct pairwise comparison of two nucleotide sequences. "BLAST 2
Sequences" can be accessed and used interactively at http://www.ncbi.nlm.nih.gov/gorf/bl2.html.
The "BLAST 2 Sequences" tool can be used for both blastn and blastp (discussed below). BLAST
programs are commonly used with gap and other parameters set to default settings. For example, to
compare two nucleotide sequences, one may use blastn with the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) set at default parameters. Such default parameters may be, for example:

Matrix: BLOSUM62
Reward for match: 1
Penalty for mismatch: -2
Open Gap: 5 and Extension Gap: 2 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 11
Filter: on

Percent identity may be measured over the length of an entire defined sequence, for example,
as defined by a particular SEQ ID number, or may be measured over a shorter length, for example,
over the length of a fragment taken from a larger, defined sequence, for instance, a fragment of at
least 20, at least 30, at least 40, at least 50, at least 70, at least 100, or at least 200 contiguous
nucleotides. Such lengths are exemplary only, and it is understood that any fragment length
supported by the sequences shown herein, in the tables, figures, or Sequence Listing, may be used to
describe a length over which percentage identity may be measured.
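The notion of "percentage of residue matches" over an alignment can be illustrated with a minimal Python sketch. It assumes the two sequences have already been aligned by some external program and padded with "-" for gaps, and it assumes one particular convention (matches divided by the number of alignment columns, with gap columns counted as non-matches). It does not reproduce the CLUSTAL V or BLAST calculations, which may use other denominators.

```python
# Illustrative sketch; function name, gap symbol, and denominator convention
# are assumptions, not the behavior of any particular alignment program.
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of alignment columns in which both sequences carry the same residue."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1
        for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper())
        if res_a == res_b and res_a != gap
    )
    return 100.0 * matches / len(aligned_a)


# Identity over a whole alignment, and over a shorter window taken from it.
a = "ACGT-ACGTTAG"
b = "ACGTTACGATAG"
print(round(percent_identity(a, b), 1))
print(round(percent_identity(a[:6], b[:6]), 1))
```

Slicing both aligned strings with the same indices before the call corresponds to measuring identity over a fragment of the defined sequence instead of over its entire length.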

Nucleic acid sequences that do not show a high degree of identity may nevertheless encode
similar amino acid sequences due to the degeneracy of the genetic code. It is understood that changes
in a nucleic acid sequence can be made using this degeneracy to produce multiple nucleic acid
sequences that all encode substantially the same protein.
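As a hedged illustration of this degeneracy, the Python sketch below translates two short, made-up DNA sequences with the standard genetic code. The sequences and helper names are hypothetical and do not come from the Sequence Listing.

```python
# Illustrative sketch; sequences and names are hypothetical.
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G
# ('*' marks a stop codon).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO)
}


def translate(dna: str) -> str:
    """Translate a coding-strand DNA sequence in frame 1, ignoring any
    trailing partial codon."""
    dna = dna.upper()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


seq_1 = "ATGGCTTCT"  # Met-Ala-Ser
seq_2 = "ATGGCAAGC"  # different codons for Ala and Ser

print(translate(seq_1), translate(seq_2))
shared = sum(x == y for x, y in zip(seq_1, seq_2))
print(f"{shared} of {len(seq_1)} nucleotide positions identical")
```

The two sequences agree at only five of nine nucleotide positions, yet both encode the same tripeptide, because alanine and serine are each specified by more than one codon.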

The phrases "percent identity" and "% identity," as applied to polypeptide sequences, refer to
the percentage of residue matches between at least two polypeptide sequences aligned using a
standardized algorithm. Methods of polypeptide sequence alignment are well-known. Some
alignment methods take into account conservative amino acid substitutions. Such conservative
substitutions, explained in more detail above, generally preserve the charge and hydrophobicity at the
site of substitution, thus preserving the structure (and therefore function) of the polypeptide.

Percent identity between polypeptide sequences may be determined using the default
parameters of the CLUSTAL V algorithm as incorporated into the MEGALIGN version 3.12e
sequence alignment program (described and referenced above). For pairwise alignments of
polypeptide sequences using CLUSTAL V, the default parameters are set as follows: Ktuple=1, gap
penalty=3, window=5, and "diagonals saved"=5. The PAM250 matrix is selected as the default
residue weight table. As with polynucleotide alignments, the percent identity is reported by
CLUSTAL V as the "percent similarity" between aligned polypeptide sequence pairs.

Alternatively the NCBI BLAST software suite may be used. For example, for a pairwise
comparison of two polypeptide sequences, one may use the "BLAST 2 Sequences" tool Version
2.0.12 (April-21-2000) with blastp set at default parameters. Such default parameters may be, for
example:

Matrix: BLOSUM62
Open Gap: 11 and Extension Gap: 1 penalties
Gap x drop-off: 50
Expect: 10
Word Size: 3
Filter: on

Percent identity may be measured over the length of an entire defined polypeptide sequence,
for example, as defined by a particular SEQ ID number, or may be measured over a shorter length, for
example, over the length of a fragment taken from a larger, defined polypeptide sequence, for
instance, a fragment of at least 15, at least 20, at least 30, at least 40, at least 50, at least 70 or at least
150 contiguous residues. Such lengths are exemplary only, and it is understood that any fragment
length supported by the sequences shown herein, in the tables, figures or Sequence Listing, may be
used to describe a length over which percentage identity may be measured.

"Human artificial chromosomes" (HACs) are linear microchromosomes which may contain
DNA sequences of about 6 kb to 10 Mb in size and which contain all of the elements required for
chromosome replication, segregation and maintenance.

The term "humanized antibody" refers to an antibody molecule in which the amino acid 
sequence in the non-antigen binding regions has been altered so that the antibody more closely
resembles a human antibody, and still retains its original binding ability.

"Hybridization" refers to the process by which a polynucleotide strand anneals with a 
complementary strand through base pairing under defined hybridization conditions. Specific
hybridization is an indication that two nucleic acid sequences share a high degree of complementarity.
Specific hybridization complexes form under permissive annealing conditions and remain hybridized
after the "washing" step(s). The washing step(s) is particularly important in determining the
stringency of the hybridization process, with more stringent conditions allowing less non-specific
binding, i.e., binding between pairs of nucleic acid strands that are not perfectly matched. Permissive
conditions for annealing of nucleic acid sequences are routinely determinable by one of ordinary skill
in the art and may be consistent among hybridization experiments, whereas wash conditions may be
varied among experiments to achieve the desired stringency, and therefore hybridization specificity.
Permissive annealing conditions occur, for example, at 68°C in the presence of about 6 x SSC, about
1% (w/v) SDS, and about 100 µg/ml sheared, denatured salmon sperm DNA.

Generally, stringency of hybridization is expressed, in part, with reference to the temperature
under which the wash step is carried out. Such wash temperatures are typically selected to be about
5°C to 20°C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic
strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of
the target sequence hybridizes to a perfectly matched probe. An equation for calculating Tm and
conditions for nucleic acid hybridization are well known and can be found in Sambrook, J. et al.
(1989) Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold Spring Harbor Press,
Plainview NY; specifically see volume 2, chapter 9.
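The relationship between Tm and wash temperature can be sketched in Python as follows. The Tm formula used here is one widely quoted empirical approximation for long DNA duplexes and is an assumption of this sketch; the exact form and constants vary between references, and it is not asserted to be the equation given in Sambrook et al. The function names and the example sequence are hypothetical.

```python
# Illustrative sketch; the formula and all names are assumptions.
import math
from typing import Tuple


def melting_temp_c(seq: str, na_molar: float) -> float:
    """Approximate Tm in degrees C:

        Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 675/N

    where [Na+] is the monovalent cation concentration in mol/l and N is the
    duplex length in nucleotides. Intended only for rough estimates.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0 or na_molar <= 0:
        raise ValueError("need a non-empty sequence and positive [Na+]")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / n
    return 81.5 + 16.6 * math.log10(na_molar) + 0.41 * gc_percent - 675.0 / n


def wash_range_c(tm_c: float) -> Tuple[float, float]:
    """Wash temperatures about 20 to 5 degrees C below Tm, per the text above."""
    return (tm_c - 20.0, tm_c - 5.0)


probe = "AT" * 25 + "GC" * 25  # hypothetical 100-nt sequence, 50% GC
tm = melting_temp_c(probe, na_molar=1.0)
low, high = wash_range_c(tm)
print(f"Tm ~ {tm:.2f} C; wash between {low:.2f} C and {high:.2f} C")
```

Lowering the salt concentration lowers the computed Tm, which is why a wash in dilute SSC at a fixed temperature is more stringent than a wash in concentrated SSC at the same temperature.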

High stringency conditions for hybridization between polynucleotides of the present
invention include wash conditions of 68°C in the presence of about 0.2 x SSC and about 0.1% SDS,
for 1 hour. Alternatively, temperatures of about 65°C, 60°C, 55°C, or 42°C may be used. SSC
concentration may be varied from about 0.1 to 2 x SSC, with SDS being present at about 0.1%.
Typically, blocking reagents are used to block non-specific hybridization. Such blocking reagents
include, for instance, sheared and denatured salmon sperm DNA at about 100-200 µg/ml. Organic
solvent, such as formamide at a concentration of about 35-50% v/v, may also be used under particular
circumstances, such as for RNA:DNA hybridizations. Useful variations on these wash conditions
will be readily apparent to those of ordinary skill in the art. Hybridization, particularly under high
stringency conditions, may be suggestive of evolutionary similarity between the nucleotides. Such
similarity is strongly indicative of a similar role for the nucleotides and their encoded polypeptides.

The term "hybridization complex" refers to a complex formed between two nucleic acids by 
virtue of the formation of hydrogen bonds between complementary bases. A hybridization complex 
may be formed in solution (e.g., C0t or R0t analysis) or formed between one nucleic acid present in
solution and another nucleic acid immobilized on a solid support (e.g., paper, membranes, filters,
chips, pins or glass slides, or any other appropriate substrate to which cells or their nucleic acids have
been fixed).

The words "insertion" and "addition" refer to changes in an amino acid or polynucleotide
sequence resulting in the addition of one or more amino acid residues or nucleotides, respectively.

"Immune response" can refer to conditions associated with inflammation, trauma, immune
disorders, or infectious or genetic disease, etc. These conditions can be characterized by expression
of various factors, e.g., cytokines, chemokines, and other signaling molecules, which may affect
cellular and systemic defense systems.

An "immunogenic fragment" is a polypeptide or oligopeptide fragment of NAAP which is 
capable of eliciting an immune response when introduced into a living organism, for example, a
mammal. The term "immunogenic fragment" also includes any polypeptide or oligopeptide fragment
of NAAP which is useful in any of the antibody production methods disclosed herein or known in the
art.

The term "microarray" refers to an arrangement of a plurality of polynucleotides,
polypeptides, antibodies, or other chemical compounds on a substrate.

The terms "element" and "array element" refer to a polynucleotide, polypeptide, antibody, or
other chemical compound having a unique and defined position on a microarray.

The term "modulate" refers to a change in the activity of NAAP. For example, modulation 
may cause an increase or a decrease in protein activity, binding characteristics, or any other
biological, functional, or immunological properties of NAAP.

The phrases "nucleic acid" and "nucleic acid sequence" refer to a nucleotide, oligonucleotide, 
polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or 
synthetic origin which may be single-stranded or double-stranded and may represent the sense or the 
antisense strand, to peptide nucleic acid (PNA), or to any DNA-like or RNA-like material. 

"Operably linked" refers to the situation in which a first nucleic acid sequence is placed in a
functional relationship with a second nucleic acid sequence. For instance, a promoter is operably
linked to a coding sequence if the promoter affects the transcription or expression of the coding
sequence. Operably linked DNA sequences may be in close proximity or contiguous and, where
necessary to join two protein coding regions, in the same reading frame.

"Peptide nucleic acid" (PNA) refers to an antisense molecule or anti-gene agent which
comprises an oligonucleotide of at least about 5 nucleotides in length linked to a peptide backbone of
amino acid residues ending in lysine. The terminal lysine confers solubility to the composition.
PNAs preferentially bind complementary single stranded DNA or RNA and stop transcript
elongation, and may be pegylated to extend their lifespan in the cell.

"Post-translational modification" of an NAAP may involve lipidation, glycosylation,
phosphorylation, acetylation, racemization, proteolytic cleavage, and other modifications known in
the art. These processes may occur synthetically or biochemically. Biochemical modifications will
vary by cell type depending on the enzymatic milieu of NAAP.

"Probe" refers to nucleic acids encoding NAAP, their complements, or fragments thereof, 

which are used to detect identical, allelic or related nucleic acids. Probes are isolated
oligonucleotides or polynucleotides attached to a detectable label or reporter molecule. Typical
labels include radioactive isotopes, ligands, chemiluminescent agents, and enzymes. "Primers" are
short nucleic acids, usually DNA oligonucleotides, which may be annealed to a target polynucleotide
by complementary base-pairing. The primer may then be extended along the target DNA strand by a
DNA polymerase enzyme. Primer pairs can be used for amplification (and identification) of a nucleic
acid, e.g., by the polymerase chain reaction (PCR).

WO 03/006618    PCT/US02/21971

Probes and primers as used in the present invention typically comprise at least 15 contiguous
nucleotides of a known sequence. In order to enhance specificity, longer probes and primers may also
be employed, such as probes and primers that comprise at least 20, 25, 30, 40, 50, 60, 70, 80, 90, 100,
or at least 150 consecutive nucleotides of the disclosed nucleic acid sequences. Probes and primers
may be considerably longer than these examples, and it is understood that any length supported by the
specification, including the tables, figures, and Sequence Listing, may be used.
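
As a minimal illustrative sketch (not part of the disclosed methods), the following Python checks whether a candidate probe or primer comprises at least 15 contiguous nucleotides of a known sequence or of its complementary strand. The function names and the plain-string sequence representation are assumptions made for illustration only.

```python
# Hypothetical helper names; the default of 15 nucleotides mirrors the text above.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written with A, C, G, T."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def comprises_contiguous_run(probe: str, known: str, min_len: int = 15) -> bool:
    """True if any min_len-nucleotide window of the probe occurs in the known
    sequence or in its reverse complement."""
    probe = probe.upper()
    strands = (known.upper(), reverse_complement(known))
    for start in range(len(probe) - min_len + 1):
        window = probe[start:start + min_len]
        if any(window in strand for strand in strands):
            return True
    return False
```

A probe shorter than the minimum length never qualifies under this sketch, since no window of the required size can be taken from it.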

Methods for preparing and using probes and primers are described in the references, for
example Sambrook, J. et al. (1989; Molecular Cloning: A Laboratory Manual, 2nd ed., vol. 1-3, Cold
Spring Harbor Press, Plainview NY), Ausubel, F.M. et al. (1999; Short Protocols in Molecular
Biology, 4th ed., John Wiley & Sons, New York NY), and Innis, M. et al. (1990; PCR Protocols, A
Guide to Methods and Applications, Academic Press, San Diego CA). PCR primer pairs can be
derived from a known sequence, for example, by using computer programs intended for that purpose
such as Primer (Version 0.5, 1991, Whitehead Institute for Biomedical Research, Cambridge MA).

Oligonucleotides for use as primers are selected using software known in the art for such
purpose. For example, OLIGO 4.06 software is useful for the selection of PCR primer pairs of up to
100 nucleotides each, and for the analysis of oligonucleotides and larger polynucleotides of up to
5,000 nucleotides from an input polynucleotide sequence of up to 32 kilobases. Similar primer
selection programs have incorporated additional features for expanded capabilities. For example, the
PrimOU primer selection program (available to the public from the Genome Center at University of
Texas South West Medical Center, Dallas TX) is capable of choosing specific primers from
megabase sequences and is thus useful for designing primers on a genome-wide scope. The Primer3
primer selection program (available to the public from the Whitehead Institute/MIT Center for
Genome Research, Cambridge MA) allows the user to input a "mispriming library," in which
sequences to avoid as primer binding sites are user-specified. Primer3 is useful, in particular, for the
selection of oligonucleotides for microarrays. (The source code for the latter two primer selection
programs may also be obtained from their respective sources and modified to meet the user's specific
needs.) The PrimeGen program (available to the public from the UK Human Genome Mapping
Project Resource Centre, Cambridge UK) designs primers based on multiple sequence alignments,
thereby allowing selection of primers that hybridize to either the most conserved or least conserved
regions of aligned nucleic acid sequences. Hence, this program is useful for identification of both
unique and conserved oligonucleotides and polynucleotide fragments. The oligonucleotides and
polynucleotide fragments identified by any of the above selection methods are useful in hybridization
technologies, for example, as PCR or sequencing primers, microarray elements, or specific probes to
identify fully or partially complementary polynucleotides in a sample of nucleic acids. Methods of
oligonucleotide selection are not limited to those described above.

A "recombinant nucleic acid" is a nucleic acid that is not naturally occurring or has a 
sequence that is made by an artificial combination of two or more otherwise separated segments of
sequence. This artificial combination is often accomplished by chemical synthesis or, more
commonly, by the artificial manipulation of isolated segments of nucleic acids, e.g., by genetic
engineering techniques such as those described in Sambrook, supra. The term recombinant includes
nucleic acids that have been altered solely by addition, substitution, or deletion of a portion of the
nucleic acid. Frequently, a recombinant nucleic acid may include a nucleic acid sequence operably
linked to a promoter sequence. Such a recombinant nucleic acid may be part of a vector that is used,
for example, to transform a cell.

Alternatively, such recombinant nucleic acids may be part of a viral vector, e.g., based on a
vaccinia virus, that could be used to vaccinate a mammal wherein the recombinant nucleic acid is
expressed, inducing a protective immunological response in the mammal.

A "regulatory element" refers to a nucleic acid sequence usually derived from untranslated
regions of a gene and includes enhancers, promoters, introns, and 5' and 3' untranslated regions
(UTRs). Regulatory elements interact with host or viral proteins which control transcription,
translation, or RNA stability.

"Reporter molecules" are chemical or biochemical moieties used for labeling a nucleic acid, 
amino acid, or antibody. Reporter molecules include radionuclides; enzymes; fluorescent,
chemiluminescent, or chromogenic agents; substrates; cofactors; inhibitors; magnetic particles; and
other moieties known in the art.

An "RNA equivalent," in reference to a DNA molecule, is composed of the same linear 
sequence of nucleotides as the reference DNA molecule with the exception that all occurrences of the 
nitrogenous base thymine are replaced with uracil, and the sugar backbone is composed of ribose
instead of deoxyribose.
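
The base substitution in this definition can be sketched in a few lines of Python. This is an illustrative assumption about how a sequence might be held as a plain string; the function name is hypothetical, and the change of sugar backbone from deoxyribose to ribose has no counterpart in a text representation.

```python
def rna_equivalent(dna: str) -> str:
    """Return the same linear sequence with every thymine (T) written as uracil (U).

    Hypothetical helper: accepts only the letters A, C, G, T (either case).
    """
    unexpected = set(dna) - set("ACGTacgt")
    if unexpected:
        raise ValueError(f"not a plain DNA sequence: {sorted(unexpected)}")
    return dna.replace("T", "U").replace("t", "u")
```

The order and length of the sequence are unchanged; only the letter used for thymine differs.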

The term "sample" is used in its broadest sense. A sample suspected of containing NAAP, 
nucleic acids encoding NAAP, or fragments thereof may comprise a bodily fluid; an extract from a 
cell, chromosome, organelle, or membrane isolated from a cell; a cell; genomic DNA, RNA, or 
cDNA, in solution or bound to a substrate; a tissue; a tissue print; etc. 

The terms "specific binding" and "specifically binding" refer to that interaction between a
protein or peptide and an agonist, an antibody, an antagonist, a small molecule, or any natural or
synthetic binding composition. The interaction is dependent upon the presence of a particular
structure of the protein, e.g., the antigenic determinant or epitope, recognized by the binding
molecule. For example, if an antibody is specific for epitope "A," the presence of a polypeptide
comprising the epitope A, or the presence of free unlabeled A, in a reaction containing free labeled A
and the antibody will reduce the amount of labeled A that binds to the antibody.

The term "substantially purified" refers to nucleic acid or amino acid sequences that are 
removed from their natural environment and are isolated or separated, and are at least about 60% free, 
preferably at least about 75% free, and most preferably at least about 90% free from other
components with which they are naturally associated.

A "substitution" refers to the replacement of one or more amino acid residues or nucleotides 
by different amino acid residues or nucleotides, respectively. 

"Substrate" refers to any suitable rigid or semi-rigid support including membranes, filters, 
chips, slides, wafers, fibers, magnetic or nonmagnetic beads, gels, tubing, plates, polymers,
microparticles and capillaries. The substrate can have a variety of surface forms, such as wells,
trenches, pins, channels and pores, to which polynucleotides or polypeptides are bound.

A "transcript image" or "expression profile" refers to the collective pattern of gene 
expression by a particular cell type or tissue under given conditions at a given time. 

"Transformation" describes a process by which exogenous DNA is introduced into a recipient 

20 cell. Transformation may occur under natural or artificial conditions according to various methods 
well known in the art, and may rely on any known method for the insertion of foreign nucleic acid 
sequences into a prokaryotic or eukaryotic host cell. The method for transformation is selected based 
on the type of host cell being transformed and may include, but is not limited to, bacteriophage or 
viral infection, electroporation, heat shock, lipofection, and particle bombardment. The term 

25 "transformed cells" includes stably transformed cells in which the inserted DNA is capable of 

replication either as an autonomously replicating plasnndd or as part of the host chromosome, as well 
as transiently transformed cells which express the inserted DNA or RNA for linnited periods of time. 

A "transgenic organism," as used herein, is any organism, including but not limited to 
animals and plants, in which one or more of the cells of the organism contains heterologous nucleic
acid introduced by way of human intervention, such as by transgenic techniques well known in the
art. The nucleic acid is introduced into the cell, directly or indirectly by introduction into a precursor
of the cell, by way of deliberate genetic manipulation, such as by microinjection or by infection with
a recombinant virus. In another embodiment, the nucleic acid can be introduced by infection with a
recombinant viral vector, such as a lentiviral vector (Lois, C. et al. (2002) Science 295:868-872). The
term genetic manipulation does not include classical cross-breeding, or in vitro fertilization, but rather
is directed to the introduction of a recombinant DNA molecule. The transgenic organisms
contemplated in accordance with the present invention include bacteria, cyanobacteria, fungi, plants
and animals. The isolated DNA of the present invention can be introduced into the host by methods
known in the art, for example infection, transfection, transformation or transconjugation. Techniques
for transferring the DNA of the present invention into such organisms are widely known and provided
in references such as Sambrook et al. (1989), supra.

A "variant" of a particular nucleic acid sequence is defined as a nucleic acid sequence having 
at least 40% sequence identity to the particular nucleic acid sequence over a certain length of one of 
the nucleic acid sequences using blastn with the "BLAST 2 Sequences" tool Version 2.0.9
(May-07-1999) set at default parameters. Such a pair of nucleic acids may show, for example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 85%, at least 90%, at least 91%, at least 92%, at least
93%, at least 94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater
sequence identity over a certain defined length. A variant may be described as, for example, an
"allelic" (as defined above), "splice," "species," or "polymorphic" variant. A splice variant may have
significant identity to a reference molecule, but will generally have a greater or lesser number of
polynucleotides due to alternate splicing of exons during mRNA processing. The corresponding
polypeptide may possess additional functional domains or lack domains that are present in the
reference molecule. Species variants are polynucleotides that vary from one species to another. The
resulting polypeptides will generally have significant amino acid identity relative to each other. A
polymorphic variant is a variation in the polynucleotide sequence of a particular gene between
individuals of a given species. Polymorphic variants also may encompass "single nucleotide
polymorphisms" (SNPs) in which the polynucleotide sequence varies by one nucleotide base. The
presence of SNPs may be indicative of, for example, a certain population, a disease state, or a
propensity for a disease state.

A "variant" of a particular polypeptide sequence is defined as a polypeptide sequence having
at least 40% sequence identity to the particular polypeptide sequence over a certain length of one of
the polypeptide sequences using blastp with the "BLAST 2 Sequences" tool Version 2.0.9
(May-07-1999) set at default parameters. Such a pair of polypeptides may show, for example, at least 50%, at
least 60%, at least 70%, at least 80%, at least 90%, at least 91%, at least 92%, at least 93%, at least
94%, at least 95%, at least 96%, at least 97%, at least 98%, or at least 99% or greater sequence
identity over a certain defined length of one of the polypeptides.
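
To make the percent-identity thresholds above concrete, the following Python is a simplified sketch. It does not reproduce blastn or blastp; it assumes the two sequences have already been aligned (for example by BLAST) and are supplied as equal-length strings with "-" marking gaps. The function names and the default threshold argument are illustrative assumptions.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of aligned columns in which both sequences carry the same residue.

    Assumes pre-aligned, equal-length inputs with '-' as the gap character.
    Columns that are gaps in both sequences are ignored.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("no aligned columns to compare")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def meets_variant_threshold(aligned_a: str, aligned_b: str, threshold: float = 40.0) -> bool:
    """True if the aligned pair shows at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold
```

Raising the threshold argument to 50, 60, 70, and so on corresponds to the progressively stricter identity levels listed in the definitions.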

THE INVENTION 

Various embodiments of the invention include new human nucleic acid-associated proteins
(NAAP), the polynucleotides encoding NAAP, and the use of these compositions for the diagnosis,
treatment, or prevention of cell proliferative, neurological, developmental, and
autoimmune/inflammatory disorders, and infections.

Table 1 summarizes the nomenclature for the full length polynucleotide and polypeptide
embodiments of the invention. Each polynucleotide and its corresponding polypeptide are correlated
to a single Incyte project identification number (Incyte Project ID). Each polypeptide sequence is
denoted by both a polypeptide sequence identification number (Polypeptide SEQ ID NO:) and an
Incyte polypeptide sequence number (Incyte Polypeptide ID) as shown. Each polynucleotide
sequence is denoted by both a polynucleotide sequence identification number (Polynucleotide SEQ
ID NO:) and an Incyte polynucleotide consensus sequence number (Incyte Polynucleotide ID) as
shown. Column 6 shows the Incyte ID numbers of physical, full length clones corresponding to
polypeptide and polynucleotide embodiments. The full length clones encode polypeptides which
have at least 95% sequence identity to the polypeptides shown in column 3.

Table 2 shows sequences with homology to the polypeptides of the invention as identified by
BLAST analysis against the GenBank protein (genpept) database. Columns 1 and 2 show the
polypeptide sequence identification number (Polypeptide SEQ ID NO:) and the corresponding Incyte
polypeptide sequence number (Incyte Polypeptide ID) for polypeptides of the invention. Column 3
shows the GenBank identification number (GenBank ID NO:) of the nearest GenBank homolog.
Column 4 shows the probability scores for the matches between each polypeptide and its homolog(s).
Column 5 shows the annotation of the GenBank homolog(s) along with relevant citations where
applicable, all of which are expressly incorporated by reference herein.

Table 3 shows various structural features of the polypeptides of the invention. Columns 1
and 2 show the polypeptide sequence identification number (SEQ ID NO:) and the corresponding
Incyte polypeptide sequence number (Incyte Polypeptide ID) for each polypeptide of the invention.
Column 3 shows the number of amino acid residues in each polypeptide. Column 4 shows potential
phosphorylation sites, and column 5 shows potential glycosylation sites, as determined by the
MOTIFS program of the GCG sequence analysis software package (Genetics Computer Group,
Madison WI). Column 6 shows amino acid residues comprising signature sequences, domains, and
motifs. Column 7 shows analytical methods for protein structure/function analysis and in some cases,
searchable databases to which the analytical methods were applied.

Together, Tables 2 and 3 summarize the properties of polypeptides of the invention, and these
properties establish that the claimed polypeptides are nucleic acid-associated proteins. For example,
SEQ ID NO:1 is 80% identical, from residue P24 to residue T316, to Rattus rattus ribosomal protein
S2 (GenBank ID g57718) as determined by the Basic Local Alignment Search Tool (BLAST). (See
Table 2.) The BLAST probability score is 2.2e-116, which indicates the probability of obtaining the
observed polypeptide sequence alignment by chance. SEQ ID NO:1 also contains a ribosomal protein
S5 domain as determined by searching for statistically significant matches in the hidden Markov
model (HMM)-based PFAM database of conserved protein family domains. (See Table 3.) Data
from BLIMPS and PROFILESCAN analyses and BLAST analyses of the PRODOM and DOMO
databases provide further corroborative evidence that SEQ ID NO:1 is a ribosomal protein. In
another example, SEQ ID NO:4 is 97% identical, from residue S650 to residue R1142, to human
CAGH32 (GenBank ID g2565061) as determined by BLAST. The BLAST probability score is
1.8e-254. SEQ ID NO:4 also contains a helicase conserved C-terminal domain as determined by searching
for statistically significant matches in the hidden Markov model (HMM)-based PFAM database.
Data from MOTIFS analysis and BLAST analysis of the PRODOM and DOMO databases provide
further corroborative evidence that SEQ ID NO:4 is a DNA modification enzyme such as a helicase.
In another example, SEQ ID NO:10 is 91% identical from residue V59 to residue E290, and 58%
identical from residue M1 to residue P275, to Bos taurus transcription factor EF1(A) (GenBank ID
g162983) as determined by BLAST. The BLAST probability score from residue V59 to residue E290 is
6.7e-115. SEQ ID NO:10 also contains a "cold shock" DNA-binding domain as determined by
searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM
database. Data from BLIMPS, MOTIFS, and PROFILESCAN analyses provide further corroborative
evidence that SEQ ID NO:10 is a transcription factor. In another example, SEQ ID NO:22 is 88%
identical, from residue M1 to residue H451, to human testis-specific RING finger protein (GenBank
ID g9650982) as determined by BLAST. The BLAST probability score is 1.8e-215. SEQ ID NO:22
also contains SPRY, B-box zinc-finger, and zinc-finger type C3HC4 domains as determined by
searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM
database. Data from BLAST-DOMO and MOTIFS analyses provide further corroborative evidence
that SEQ ID NO:22 is a RFP transforming protein or cell attachment sequence. In another example,
SEQ ID NO:25 is 95% identical, from residue M1 to residue D334, to mouse ventral anterior
homeobox-containing protein-1 (GenBank ID g3641258) as determined by BLAST. The BLAST
probability score is 2.7e-166. SEQ ID NO:25 also contains a homeobox domain as determined by
searching for statistically significant matches in the hidden Markov model (HMM)-based PFAM
database. Data from BLIMPS, MOTIFS, and PROFILESCAN analyses provide further corroborative
evidence that SEQ ID NO:25 is a homeobox-containing protein. In a further example, SEQ ID NO:31
is 90% identical, from residue R130 to residue A816, and 68% identical, from residue M1 to residue
S129, to human eukaryotic initiation factor, EIF2C1, (GenBank ID g6002623) as determined by
BLAST. The BLAST probability score is 0.0. SEQ ID NO:31 also contains a PAZ (proteins Piwi,
Argonaut, and Zwille/Pinhead) domain and a Piwi (a Drosophila protein which functions in RNA
interference) domain as determined by searching for statistically significant matches in the hidden
Markov model (HMM)-based PFAM database. In yet another example, SEQ ID NO:33 is 74% identical, from
residue M1 to residue Y255, to human zinc finger protein (GenBank ID g347906) as determined by
BLAST. The BLAST probability score is 3.6e-100. SEQ ID NO:33 also contains a KRAB box as
well as a zinc finger domain as determined by searching for statistically significant matches in the
hidden Markov model (HMM)-based PFAM database. Data from MOTIFS and further BLAST
analyses provide further corroborative evidence that SEQ ID NO:33 is a zinc finger protein. SEQ ID
NO:2-3, SEQ ID NO:5-9, SEQ ID NO:11-21, SEQ ID NO:23-24, SEQ ID NO:26-30, SEQ ID NO:32,
and SEQ ID NO:34-35 were analyzed and annotated in a similar manner. The algorithms and
parameters for the analysis of SEQ ID NO:1-35 are described in Table 7.

As shown in Table 4, the full length polynucleotide embodiments were assembled using
cDNA sequences or coding (exon) sequences derived from genomic DNA, or any combination of
these two types of sequences. Column 1 lists the polynucleotide sequence identification number
(Polynucleotide SEQ ID NO:), the corresponding Incyte polynucleotide consensus sequence number
(Incyte ID) for each polynucleotide of the invention, and the length of each polynucleotide sequence
in basepairs. Column 2 shows the nucleotide start (5') and stop (3') positions of the cDNA and/or
genomic sequences used to assemble the full length polynucleotide embodiments, and of fragments of
the polynucleotides which are useful, for example, in hybridization or amplification technologies that
identify SEQ ID NO:36-70 or that distinguish between SEQ ID NO:36-70 and related
polynucleotides.

The polynucleotide fragments described in Column 2 of Table 4 may refer specifically, for
example, to Incyte cDNAs derived from tissue-specific cDNA libraries or from pooled cDNA
libraries. Alternatively, the polynucleotide fragments described in column 2 may refer to GenBank
cDNAs or ESTs which contributed to the assembly of the full length polynucleotides. In addition, the
polynucleotide fragments described in column 2 may identify sequences derived from the ENSEMBL
(The Sanger Centre, Cambridge, UK) database (i.e., those sequences including the designation
"ENST"). Alternatively, the polynucleotide fragments described in column 2 may be derived from
the NCBI RefSeq Nucleotide Sequence Records Database (i.e., those sequences including the
designation "NM" or "NT") or the NCBI RefSeq Protein Sequence Records (i.e., those sequences
including the designation "NP"). Alternatively, the polynucleotide fragments described in column 2
may refer to assemblages of both cDNA and Genscan-predicted exons brought together by an "exon
stitching" algorithm. For example, a polynucleotide sequence identified as
FL_XXXXXX_N1_N2_YYYYY_N3_N4 represents a "stitched" sequence in which XXXXXX is the
identification number of the cluster of sequences to which the algorithm was applied, and YYYYY is
the number of the prediction generated by the algorithm, and N1,2,3..., if present, represent specific
exons that may have been manually edited during analysis (See Example V). Alternatively, the
polynucleotide fragments in column 2 may refer to assemblages of exons brought together by an
"exon-stretching" algorithm. For example, a polynucleotide sequence identified as
FLXXXXXX_gAAAAA_gBBBBB_1_N is a "stretched" sequence, with XXXXXX being the Incyte
project identification number, gAAAAA being the GenBank identification number of the human
genomic sequence to which the "exon-stretching" algorithm was applied, gBBBBB being the
GenBank identification number or NCBI RefSeq identification number of the nearest GenBank
protein homolog, and N referring to specific exons (See Example V). In instances where a RefSeq
sequence was used as a protein homolog for the "exon-stretching" algorithm, a RefSeq identifier
(denoted by "NM," "NP," or "NT") may be used in place of the GenBank identifier (i.e., gBBBBB).

Alternatively, a prefix identifies component sequences that were hand-edited, predicted from
genomic DNA sequences, or derived from a combination of sequence analysis methods. The
following Table lists examples of component sequence prefixes and corresponding sequence analysis
methods associated with the prefixes (see Example IV and Example V).



Prefix            Type of analysis and/or examples of programs

GNN, GFG, ENST    Exon prediction from genomic sequences using, for example,
                  GENSCAN (Stanford University, CA, USA) or FGENES
                  (Computer Genomics Group, The Sanger Centre, Cambridge, UK).

GBI               Hand-edited analysis of genomic sequences.

FL                Stitched or stretched genomic sequences (see Example V).

INCY              Full length transcript and exon prediction from mapping of EST
                  sequences to the genome. Genomic location and EST composition
                  data are combined to predict the exons and resulting transcript.



In some cases, Incyte cDNA coverage redundant with the sequence coverage shown in Table
4 was obtained to confirm the final consensus polynucleotide sequence, but the relevant Incyte cDNA
identification numbers are not shown.

Table 5 shows the representative cDNA libraries for those full length polynucleotides which
were assembled using Incyte cDNA sequences. The representative cDNA library is the Incyte cDNA
library which is most frequently represented by the Incyte cDNA sequences which were used to
assemble and confirm the above polynucleotides. The tissues and vectors which were used to
construct the cDNA libraries shown in Table 5 are described in Table 6.

The invention also encompasses NAAP variants. A preferred NAAP variant is one which has
at least about 80%, or alternatively at least about 90%, or even at least about 95% amino acid
sequence identity to the NAAP amino acid sequence, and which contains at least one functional or
structural characteristic of NAAP.

Various embodiments also encompass polynucleotides which encode NAAP. In a particular
embodiment, the invention encompasses a polynucleotide sequence comprising a sequence selected
from the group consisting of SEQ ID NO:36-70, which encodes NAAP. The polynucleotide
sequences of SEQ ID NO:36-70, as presented in the Sequence Listing, embrace the equivalent RNA
sequences, wherein occurrences of the nitrogenous base thymine are replaced with uracil, and the
sugar backbone is composed of ribose instead of deoxyribose.

The invention also encompasses variants of a polynucleotide encoding NAAP. In particular,
such a variant polynucleotide will have at least about 70%, or alternatively at least about 85%, or
even at least about 95% polynucleotide sequence identity to a polynucleotide encoding NAAP. A
particular aspect of the invention encompasses a variant of a polynucleotide comprising a sequence
selected from the group consisting of SEQ ID NO:36-70 which has at least about 70%, or
alternatively at least about 85%, or even at least about 95% polynucleotide sequence identity to a
nucleic acid sequence selected from the group consisting of SEQ ID NO:36-70. Any one of the
polynucleotide variants described above can encode a polypeptide which contains at least one
functional or structural characteristic of NAAP.

In addition, or in the alternative, a polynucleotide variant of the invention is a splice variant
of a polynucleotide encoding NAAP. A splice variant may have portions which have significant
sequence identity to a polynucleotide encoding NAAP, but will generally have a greater or lesser
number of polynucleotides due to additions or deletions of blocks of sequence arising from alternate
splicing of exons during mRNA processing. A splice variant may have less than about 70%, or
alternatively less than about 60%, or alternatively less than about 50% polynucleotide sequence
identity to a polynucleotide encoding NAAP over its entire length; however, portions of the splice
variant will have at least about 70%, or alternatively at least about 85%, or alternatively at least about
95%, or alternatively 100% polynucleotide sequence identity to portions of the polynucleotide
encoding NAAP. Any one of the splice variants described above can encode a polypeptide which
contains at least one functional or structural characteristic of NAAP.

It will be appreciated by those skilled in the art that as a result of the degeneracy of the
genetic code, a multitude of polynucleotide sequences encoding NAAP, some bearing minimal
similarity to the polynucleotide sequences of any known and naturally occurring gene, may be
produced. Thus, the invention contemplates each and every possible variation of polynucleotide
sequence that could be made by selecting combinations based on possible codon choices. These
combinations are made in accordance with the standard triplet genetic code as applied to the
polynucleotide sequence of naturally occurring NAAP, and all such variations are to be considered as
being specifically disclosed.
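
The scale of this degeneracy can be illustrated with a short Python sketch that counts, and for short peptides enumerates, the DNA sequences that encode a given amino acid sequence under the standard genetic code. The function names are hypothetical and the sketch is offered only as an illustration of the principle, not as a method of the invention.

```python
from itertools import product
from math import prod

# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... ('*' = stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the residue they encode.
SYNONYMOUS_CODONS = {}
for codon, aa in CODON_TABLE.items():
    SYNONYMOUS_CODONS.setdefault(aa, []).append(codon)


def count_encodings(peptide: str) -> int:
    """Number of distinct DNA sequences that encode the peptide (one-letter codes)."""
    return prod(len(SYNONYMOUS_CODONS[aa]) for aa in peptide.upper())


def all_encodings(peptide: str):
    """Yield every DNA sequence encoding the peptide; practical for short peptides only."""
    choices = [SYNONYMOUS_CODONS[aa] for aa in peptide.upper()]
    for codons in product(*choices):
        yield "".join(codons)
```

Because the count is a product over residues, it grows multiplicatively with peptide length, which is why enumeration is feasible only for short stretches while the count itself remains cheap to compute.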

Although polynucleotides which encode NAAP and its variants are generally capable of
hybridizing to polynucleotides encoding naturally occurring NAAP under appropriately selected
conditions of stringency, it may be advantageous to produce polynucleotides encoding NAAP or its
derivatives possessing a substantially different codon usage, e.g., inclusion of non-naturally occurring
codons. Codons may be selected to increase the rate at which expression of the peptide occurs in a
particular prokaryotic or eukaryotic host in accordance with the frequency with which particular
codons are utilized by the host. Other reasons for substantially altering the nucleotide sequence
encoding NAAP and its derivatives without altering the encoded amino acid sequences include the
production of RNA transcripts having more desirable properties, such as a greater half-life, than
transcripts produced from the naturally occurring sequence.

The invention also encompasses production of polynucleotides which encode NAAP and
NAAP derivatives, or fragments thereof, entirely by synthetic chemistry. After production, the
synthetic polynucleotide may be inserted into any of the many available expression vectors and cell
systems using reagents well known in the art. Moreover, synthetic chemistry may be used to
introduce mutations into a polynucleotide encoding NAAP or any fragment thereof.

Embodiments of the invention can also include polynucleotides that are capable of
hybridizing to the claimed polynucleotides, and, in particular, to those having the sequences shown in
SEQ ID NO:36-70 and fragments thereof, under various conditions of stringency (Wahl, G.M. and
S.L. Berger (1987) Methods Enzymol. 152:399-407; Kimmel, A.R. (1987) Methods Enzymol.
152:507-511). Hybridization conditions, including annealing and wash conditions, are described in
"Definitions."

Methods for DNA sequencing are well known in the art and may be used to practice any of
the embodiments of the invention. The methods may employ such enzymes as the Klenow fragment
of DNA polymerase I, SEQUENASE (US Biochemical, Cleveland OH), Taq polymerase (Applied
Biosystems), thermostable T7 polymerase (Amersham Biosciences, Piscataway NJ), or combinations
of polymerases and proofreading exonucleases such as those found in the ELONGASE amplification
system (Invitrogen, Carlsbad CA). Preferably, sequence preparation is automated with machines such
as the MICROLAB 2200 liquid transfer system (Hamilton, Reno NV), PTC200 thermal cycler (MJ
Research, Watertown MA) and ABI CATALYST 800 thermal cycler (Applied Biosystems).
Sequencing is then carried out using either the ABI 373 or 377 DNA sequencing system (Applied
Biosystems), the MEGABACE 1000 DNA sequencing system (Amersham Biosciences), or other
systems known in the art. The resulting sequences are analyzed using a variety of algorithms which
are well known in the art (Ausubel et al., supra, ch. 7; Meyers, R.A. (1995) Molecular Biology and
Biotechnology, Wiley VCH, New York NY, pp. 856-853).

The nucleic acids encoding NAAP may be extended utilizing a partial nucleotide sequence
and employing various PCR-based methods known in the art to detect upstream sequences, such as
promoters and regulatory elements. For example, one method which may be employed,
restriction-site PCR, uses universal and nested primers to amplify unknown sequence from genomic
DNA within a cloning vector (Sarkar, G. (1993) PCR Methods Applic. 2:318-322). Another method,
inverse PCR, uses primers that extend in divergent directions to amplify unknown sequence from a
circularized template. The template is derived from restriction fragments comprising a known
genomic locus and surrounding sequences (Triglia, T. et al. (1988) Nucleic Acids Res. 16:8186). A
third method, capture PCR, involves PCR amplification of DNA fragments adjacent to known
sequences in human and yeast artificial chromosome DNA (Lagerstrom, M. et al. (1991) PCR
Methods Applic. 1:111-119). In this method, multiple restriction enzyme digestions and ligations
may be used to insert an engineered double-stranded sequence into a region of unknown sequence
before performing PCR. Other methods which may be used to retrieve unknown sequences are
known in the art (Parker, J.D. et al. (1991) Nucleic Acids Res. 19:3055-3060). Additionally, one may
use PCR, nested primers, and PROMOTERFINDER libraries (Clontech, Palo Alto CA) to walk
genomic DNA. This procedure avoids the need to screen libraries and is useful in finding intron/exon
junctions. For all PCR-based methods, primers may be designed using commercially available
software, such as OLIGO 4.06 primer analysis software (National Biosciences, Plymouth MN) or
another appropriate program, to be about 22 to 30 nucleotides in length, to have a GC content of
about 50% or more, and to anneal to the template at temperatures of about 68°C to 72°C.
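
By way of illustration only, the length and GC-content criteria stated above may be checked computationally. The following Python sketch is not taken from any of the cited software; the function names and the example sequences are hypothetical. It does not estimate annealing temperature, since melting-temperature formulas vary between programs.

```python
def gc_content(primer: str) -> float:
    """Return the fraction of G and C bases in a primer (0.0 to 1.0)."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


def meets_design_criteria(primer: str,
                          min_len: int = 22,
                          max_len: int = 30,
                          min_gc: float = 0.50) -> bool:
    """Check the length (about 22 to 30 nt) and GC content (about 50% or more).

    The default thresholds mirror the ranges given in the text; the
    annealing temperature is deliberately not evaluated here.
    """
    seq = primer.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("primer may contain only A, C, G, and T")
    return min_len <= len(seq) <= max_len and gc_content(seq) >= min_gc


# Hypothetical candidate primers, for illustration only.
candidate = "GCGCATATGCGCATATGCGCATAT"   # 24 nt, half of the bases are G or C
at_rich = "ATATATATATATATATATATATAT"     # 24 nt, no G or C
print(meets_design_criteria(candidate), meets_design_criteria(at_rich))
```

A candidate that fails either check would simply be redesigned; the thresholds are exposed as parameters because the text gives them as approximate values.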

When screening for full length cDNAs, it is preferable to use libraries that have been
size-selected to include larger cDNAs. In addition, random-primed libraries, which often include
sequences containing the 5' regions of genes, are preferable for situations in which an oligo d(T)
library does not yield a full-length cDNA. Genomic libraries may be useful for extension of sequence
into 5' non-transcribed regulatory regions.

Capillary electrophoresis systems which are commercially available may be used to analyze
the size or confirm the nucleotide sequence of sequencing or PCR products. In particular, capillary
sequencing may employ flowable polymers for electrophoretic separation, four different nucleotide-specific,
laser-stimulated fluorescent dyes, and a charge coupled device camera for detection of the
emitted wavelengths. Output/light intensity may be converted to electrical signal using appropriate
software (e.g., GENOTYPER and SEQUENCE NAVIGATOR, Applied Biosystems), and the entire
process from loading of samples to computer analysis and electronic data display may be computer
controlled. Capillary electrophoresis is especially preferable for sequencing small DNA fragments
which may be present in limited amounts in a particular sample.

In another embodiment of the invention, polynucleotides or fragments thereof which encode
NAAP may be cloned in recombinant DNA molecules that direct expression of NAAP, or fragments
or functional equivalents thereof, in appropriate host cells. Due to the inherent degeneracy of the
genetic code, other polynucleotides which encode substantially the same or a functionally equivalent
polypeptide may be produced and used to express NAAP.
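
The degeneracy referred to above can be illustrated with a short Python sketch. It assumes the standard genetic code; the two DNA sequences and the function name are hypothetical and are not sequences of the invention. The sketch merely shows that two different polynucleotides can encode the same peptide.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate a coding sequence, read in frame from its first base."""
    seq = dna.upper()
    if len(seq) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, len(seq), 3))


# Two hypothetical sequences that differ at three codon wobble positions.
seq_a = "ATGGCTCTGAAA"
seq_b = "ATGGCCTTAAAG"
print(translate(seq_a), translate(seq_b))
```

Both sequences translate to the same four-residue peptide although they differ at the nucleotide level, which is the property relied upon when alternative coding sequences are used to express the same polypeptide.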

The polynucleotides of the invention can be engineered using methods generally known in the
art in order to alter NAAP-encoding sequences for a variety of purposes including, but not limited to,
modification of the cloning, processing, and/or expression of the gene product. DNA shuffling by
random fragmentation and PCR reassembly of gene fragments and synthetic oligonucleotides may be
used to engineer the nucleotide sequences. For example, oligonucleotide-mediated site-directed
mutagenesis may be used to introduce mutations that create new restriction sites, alter glycosylation
patterns, change codon preference, produce splice variants, and so forth.

The nucleotides of the present invention may be subjected to DNA shuffling techniques such
as MOLECULARBREEDING (Maxygen Inc., Santa Clara CA; described in U.S. Patent No.
5,837,458; Chang, C.-C. et al. (1999) Nat. Biotechnol. 17:793-797; Christians, P.C. et al. (1999) Nat.
Biotechnol. 17:259-264; and Crameri, A. et al. (1996) Nat. Biotechnol. 14:315-319) to alter or
improve the biological properties of NAAP, such as its biological or enzymatic activity or its ability
to bind to other molecules or compounds. DNA shuffling is a process by which a library of gene
variants is produced using PCR-mediated recombination of gene fragments. The library is then
subjected to selection or screening procedures that identify those gene variants with the desired
properties. These preferred variants may then be pooled and further subjected to recursive rounds of
DNA shuffling and selection/screening. Thus, genetic diversity is created through "artificial"
breeding and rapid molecular evolution. For example, fragments of a single gene containing random
point mutations may be recombined, screened, and then reshuffled until the desired properties are
optimized. Alternatively, fragments of a given gene may be recombined with fragments of
homologous genes in the same gene family, either from the same or different species, thereby
maximizing the genetic diversity of multiple naturally occurring genes in a directed and controllable
manner.
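
The shuffle-and-screen cycle described above may be pictured with a toy in silico model. The following Python sketch is a simplification offered for illustration only and does not reproduce the MOLECULARBREEDING process or any wet-laboratory protocol: parent genes are assumed to be equal-length strings, recombination is modeled as random crossovers between parents, and the scoring function passed to the screening step is a hypothetical stand-in for a real selection assay.

```python
import random


def shuffle_parents(parents, n_crossovers, rng):
    """Build one chimeric sequence by switching donor parent at random cut points."""
    length = len(parents[0])
    if any(len(p) != length for p in parents):
        raise ValueError("all parent sequences must have the same length")
    if not 0 <= n_crossovers < length:
        raise ValueError("n_crossovers must be between 0 and length - 1")
    cuts = sorted(rng.sample(range(1, length), n_crossovers))
    bounds = [0, *cuts, length]
    pieces = []
    for start, end in zip(bounds, bounds[1:]):
        donor = rng.choice(parents)          # each segment comes from one parent
        pieces.append(donor[start:end])
    return "".join(pieces)


def make_library(parents, size, n_crossovers, rng):
    """Generate a library of chimeric variants from the parent sequences."""
    return [shuffle_parents(parents, n_crossovers, rng) for _ in range(size)]


def select_top(library, score, keep):
    """Screening step: keep the highest-scoring variants (score is hypothetical)."""
    return sorted(library, key=score, reverse=True)[:keep]


rng = random.Random(0)                       # seeded for reproducibility
parents = ["AAAAAAAAAA", "CCCCCCCCCC", "GGGGGGGGGG"]
library = make_library(parents, size=20, n_crossovers=3, rng=rng)
# Recursive round: the best variants become the parents of the next library.
survivors = select_top(library, score=lambda s: s.count("C"), keep=3)
next_round = make_library(survivors, size=20, n_crossovers=3, rng=rng)
```

Feeding the survivors of one round back in as parents corresponds to the recursive rounds of shuffling and selection/screening mentioned in the text.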

In another embodiment, polynucleotides encoding NAAP may be synthesized, in whole or in
part, using one or more chemical methods well known in the art (Caruthers, M.H. et al. (1980)
Nucleic Acids Symp. Ser. 7:215-223; Horn, T. et al. (1980) Nucleic Acids Symp. Ser. 7:225-232).
Alternatively, NAAP itself or a fragment thereof may be synthesized using chemical methods known
in the art. For example, peptide synthesis can be performed using various solution-phase or
solid-phase techniques (Creighton, T. (1984) Proteins, Structures and Molecular Properties, WH
Freeman, New York NY, pp. 55-60; Roberge, J.Y. et al. (1995) Science 269:202-204). Automated
synthesis may be achieved using the ABI 431A peptide synthesizer (Applied Biosystems).
Additionally, the amino acid sequence of NAAP, or any part thereof, may be altered during direct
synthesis and/or combined with sequences from other proteins, or any part thereof, to produce a
variant polypeptide or a polypeptide having a sequence of a naturally occurring polypeptide.
The peptide may be substantially purified by preparative high performance liquid
chromatography (Chiez, R.M. and F.Z. Regnier (1990) Methods Enzymol. 182:392-421). The
composition of the synthetic peptides may be confirmed by amino acid analysis or by sequencing
(Creighton, supra, pp. 28-53).

In order to express a biologically active NAAP, the polynucleotides encoding NAAP or
derivatives thereof may be inserted into an appropriate expression vector, i.e., a vector which contains
the necessary elements for transcriptional and translational control of the inserted coding sequence in
a suitable host. These elements include regulatory sequences, such as enhancers, constitutive and
inducible promoters, and 5' and 3' untranslated regions in the vector and in polynucleotides encoding
NAAP. Such elements may vary in their strength and specificity. Specific initiation signals may also
be used to achieve more efficient translation of polynucleotides encoding NAAP. Such signals
include the ATG initiation codon and adjacent sequences, e.g. the Kozak sequence. In cases where a
polynucleotide sequence encoding NAAP and its initiation codon and upstream regulatory sequences
are inserted into the appropriate expression vector, no additional transcriptional or translational
control signals may be needed. However, in cases where only coding sequence, or a fragment
thereof, is inserted, exogenous translational control signals including an in-frame ATG initiation
codon should be provided by the vector. Exogenous translational elements and initiation codons may
be of various origins, both natural and synthetic. The efficiency of expression may be enhanced by
the inclusion of enhancers appropriate for the particular host cell system used (Scharf, D. et al. (1994)
Results Probl. Cell Differ. 20:125-162).

Methods which are well known to those skilled in the art may be used to construct expression
vectors containing polynucleotides encoding NAAP and appropriate transcriptional and translational
control elements. These methods include in vitro recombinant DNA techniques, synthetic techniques,
and in vivo genetic recombination (Sambrook, J. et al. (1989) Molecular Cloning, A Laboratory
Manual, Cold Spring Harbor Press, Plainview NY, ch. 4, 8, and 16-17; Ausubel et al., supra, ch. 1, 3,
and 15).

A variety of expression vector/host systems may be utilized to contain and express
polynucleotides encoding NAAP. These include, but are not limited to, microorganisms such as
bacteria transformed with recombinant bacteriophage, plasmid, or cosmid DNA expression vectors;
yeast transformed with yeast expression vectors; insect cell systems infected with viral expression
vectors (e.g., baculovirus); plant cell systems transformed with viral expression vectors (e.g.,
cauliflower mosaic virus, CaMV, or tobacco mosaic virus, TMV) or with bacterial expression vectors
(e.g., Ti or pBR322 plasmids); or animal cell systems (Sambrook, supra; Ausubel et al., supra; Van
Heeke, G. and S.M. Schuster (1989) J. Biol. Chem. 264:5503-5509; Engelhard, E.K. et al. (1994)
Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther. 7:1937-1945;
Takamatsu, N. (1987) EMBO J. 6:307-311; The McGraw Hill Yearbook of Science and Technology
(1992) McGraw Hill, New York NY, pp. 191-196; Logan, J. and T. Shenk (1984) Proc. Natl. Acad.
Sci. USA 81:3655-3659; Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355). Expression vectors
derived from retroviruses, adenoviruses, or herpes or vaccinia viruses, or from various bacterial
plasmids, may be used for delivery of polynucleotides to the targeted organ, tissue, or cell population
(Di Nicola, M. et al. (1998) Cancer Gen. Ther. 5:350-356; Yu, M. et al. (1993) Proc. Natl. Acad. Sci.
USA 90:6340-6344; Buller, R.M. et al. (1985) Nature 317:813-815; McGregor, D.P. et al. (1994)
Mol. Immunol. 31:219-226; Verma, I.M. and N. Somia (1997) Nature 389:239-242). The invention is
not limited by the host cell employed.

In bacterial systems, a number of cloning and expression vectors may be selected depending
upon the use intended for polynucleotides encoding NAAP. For example, routine cloning,
subcloning, and propagation of polynucleotides encoding NAAP can be achieved using a
multifunctional E. coli vector such as PBLUESCRIPT (Stratagene, La Jolla CA) or PSPORT1
plasmid (Invitrogen). Ligation of polynucleotides encoding NAAP into the vector's multiple cloning
site disrupts the lacZ gene, allowing a colorimetric screening procedure for identification of
transformed bacteria containing recombinant molecules. In addition, these vectors may be useful for
in vitro transcription, dideoxy sequencing, single strand rescue with helper phage, and creation of
nested deletions in the cloned sequence (Van Heeke, G. and S.M. Schuster (1989) J. Biol. Chem.
264:5503-5509). When large quantities of NAAP are needed, e.g. for the production of antibodies,
vectors which direct high level expression of NAAP may be used. For example, vectors containing
the strong, inducible SP6 or T7 bacteriophage promoter may be used.

Yeast expression systems may be used for production of NAAP. A number of vectors
containing constitutive or inducible promoters, such as alpha factor, alcohol oxidase, and PGH
promoters, may be used in the yeast Saccharomyces cerevisiae or Pichia pastoris. In addition, such
vectors direct either the secretion or intracellular retention of expressed proteins and enable
integration of foreign polynucleotide sequences into the host genome for stable propagation (Ausubel
et al., supra; Bitter, G.A. et al. (1987) Methods Enzymol. 153:516-544; Scorer, C.A. et al. (1994)
Bio/Technology 12:181-184).

Plant systems may also be used for expression of NAAP. Transcription of polynucleotides
encoding NAAP may be driven by viral promoters, e.g., the 35S and 19S promoters of CaMV used
alone or in combination with the omega leader sequence from TMV (Takamatsu, N. (1987) EMBO J.
6:307-311). Alternatively, plant promoters such as the small subunit of RUBISCO or heat shock
promoters may be used (Coruzzi, G. et al. (1984) EMBO J. 3:1671-1680; Broglie, R. et al. (1984)
Science 224:838-843; Winter, J. et al. (1991) Results Probl. Cell Differ. 17:85-105). These
constructs can be introduced into plant cells by direct DNA transformation or pathogen-mediated
transfection (The McGraw Hill Yearbook of Science and Technology (1992) McGraw Hill, New
York NY, pp. 191-196).

In mammalian cells, a number of viral-based expression systems may be utilized. In cases
where an adenovirus is used as an expression vector, polynucleotides encoding NAAP may be ligated
into an adenovirus transcription/translation complex consisting of the late promoter and tripartite
leader sequence. Insertion in a non-essential E1 or E3 region of the viral genome may be used to
obtain infective virus which expresses NAAP in host cells (Logan, J. and T. Shenk (1984) Proc. Natl.
Acad. Sci. USA 81:3655-3659). In addition, transcription enhancers, such as the Rous sarcoma virus
(RSV) enhancer, may be used to increase expression in mammalian host cells. SV40 or EBV-based
vectors may also be used for high-level protein expression.

Human artificial chromosomes (HACs) may also be employed to deliver larger fragments of 
DNA than can be contained in and expressed from a plasmid. HACs of about 6 kb to 10 Mb are 
constructed and delivered via conventional delivery methods (liposomes, polycationic amino 
polymers, or vesicles) for therapeutic purposes (Harrington, J.J. et al. (1997) Nat. Genet. 15:345-355). 

For long term production of recombinant proteins in mammalian systems, stable expression
of NAAP in cell lines is preferred. For example, polynucleotides encoding NAAP can be transformed
into cell lines using expression vectors which may contain viral origins of replication and/or
endogenous expression elements and a selectable marker gene on the same or on a separate vector.
Following the introduction of the vector, cells may be allowed to grow for about 1 to 2 days in
enriched media before being switched to selective media. The purpose of the selectable marker is to
confer resistance to a selective agent, and its presence allows growth and recovery of cells which
successfully express the introduced sequences. Resistant clones of stably transformed cells may be
propagated using tissue culture techniques appropriate to the cell type.

Any number of selection systems may be used to recover transformed cell lines. These
include, but are not limited to, the herpes simplex virus thymidine kinase and adenine
phosphoribosyltransferase genes, for use in tk- and apr- cells, respectively (Wigler, M. et al. (1977)
Cell 11:223-232; Lowy, I. et al. (1980) Cell 22:817-823). Also, antimetabolite, antibiotic, or
herbicide resistance can be used as the basis for selection. For example, dhfr confers resistance to
methotrexate; neo confers resistance to the aminoglycosides neomycin and G-418; and als and pat
confer resistance to chlorsulfuron and phosphinothricin, respectively (Wigler, M. et
al. (1980) Proc. Natl. Acad. Sci. USA 77:3567-3570; Colbere-Garapin, F. et al. (1981) J. Mol. Biol.
150:1-14). Additional selectable genes have been described, e.g., trpB and hisD, which alter cellular
requirements for metabolites (Hartman, S.C. and R.C. Mulligan (1988) Proc. Natl. Acad. Sci. USA
85:8047-8051). Visible markers, e.g., anthocyanins, green fluorescent proteins (GFP; Clontech),
β-glucuronidase and its substrate β-glucuronide, or luciferase and its substrate luciferin may be used.
These markers can be used not only to identify transformants, but also to quantify the amount of
transient or stable protein expression attributable to a specific vector system (Rhodes, C.A. (1995)
Methods Mol. Biol. 55:121-131).

Although the presence/absence of marker gene expression suggests that the gene of interest is
also present, the presence and expression of the gene may need to be confirmed. For example, if the
sequence encoding NAAP is inserted within a marker gene sequence, transformed cells containing
polynucleotides encoding NAAP can be identified by the absence of marker gene function.
Alternatively, a marker gene can be placed in tandem with a sequence encoding NAAP under the
control of a single promoter. Expression of the marker gene in response to induction or selection
usually indicates expression of the tandem gene as well.

In general, host cells that contain the polynucleotide encoding NAAP and that express NAAP
may be identified by a variety of procedures known to those of skill in the art. These procedures
include, but are not limited to, DNA-DNA or DNA-RNA hybridizations, PCR amplification, and
protein bioassay or immunoassay techniques which include membrane, solution, or chip based
technologies for the detection and/or quantification of nucleic acid or protein sequences.

Immunological methods for detecting and measuring the expression of NAAP using either
specific polyclonal or monoclonal antibodies are known in the art. Examples of such techniques
include enzyme-linked immunosorbent assays (ELISAs), radioimmunoassays (RIAs), and
fluorescence activated cell sorting (FACS). A two-site, monoclonal-based immunoassay utilizing
monoclonal antibodies reactive to two non-interfering epitopes on NAAP is preferred, but a
competitive binding assay may be employed. These and other assays are well known in the art
(Hampton, R. et al. (1990) Serological Methods, a Laboratory Manual, APS Press, St. Paul MN, Sect.
IV; Coligan, J.E. et al. (1997) Current Protocols in Immunology, Greene Pub. Associates and
Wiley-Interscience, New York NY; Pound, J.D. (1998) Immunochemical Protocols, Humana Press, Totowa
NJ).

A wide variety of labels and conjugation techniques are known by those skilled in the art and
may be used in various nucleic acid and amino acid assays. Means for producing labeled
hybridization or PCR probes for detecting sequences related to polynucleotides encoding NAAP
include oligolabeling, nick translation, end-labeling, or PCR amplification using a labeled nucleotide.
Alternatively, polynucleotides encoding NAAP, or any fragments thereof, may be cloned into a vector
for the production of an mRNA probe. Such vectors are known in the art, are commercially available,
and may be used to synthesize RNA probes in vitro by addition of an appropriate RNA polymerase
such as T7, T3, or SP6 and labeled nucleotides. These procedures may be conducted using a variety
of commercially available kits, such as those provided by Amersham Biosciences, Promega (Madison
WI), and US Biochemical. Suitable reporter molecules or labels which may be used for ease of
detection include radionuclides, enzymes, fluorescent, chemiluminescent, or chromogenic agents, as
well as substrates, cofactors, inhibitors, magnetic particles, and the like.

Host cells transformed with polynucleotides encoding NAAP may be cultured under
conditions suitable for the expression and recovery of the protein from cell culture. The protein
produced by a transformed cell may be secreted or retained intracellularly depending on the sequence
and/or the vector used. As will be understood by those of skill in the art, expression vectors
containing polynucleotides which encode NAAP may be designed to contain signal sequences which
direct secretion of NAAP through a prokaryotic or eukaryotic cell membrane.

In addition, a host cell strain may be chosen for its ability to modulate expression of the
inserted polynucleotides or to process the expressed protein in the desired fashion. Such
modifications of the polypeptide include, but are not limited to, acetylation, carboxylation,
glycosylation, phosphorylation, lipidation, and acylation. Post-translational processing which cleaves
a "prepro" or "pro" form of the protein may also be used to specify protein targeting, folding, and/or
activity. Different host cells which have specific cellular machinery and characteristic mechanisms
for post-translational activities (e.g., CHO, HeLa, MDCK, HEK293, and WI38) are available from the
American Type Culture Collection (ATCC, Manassas VA) and may be chosen to ensure the correct
modification and processing of the foreign protein.

In another embodiment of the invention, natural, modified, or recombinant polynucleotides
encoding NAAP may be ligated to a heterologous sequence resulting in translation of a fusion protein
in any of the aforementioned host systems. For example, a chimeric NAAP protein containing a
heterologous moiety that can be recognized by a commercially available antibody may facilitate the
screening of peptide libraries for inhibitors of NAAP activity. Heterologous protein and peptide
moieties may also facilitate purification of fusion proteins using commercially available affinity
matrices. Such moieties include, but are not limited to, glutathione S-transferase (GST), maltose
binding protein (MBP), thioredoxin (Trx), calmodulin binding peptide (CBP), 6-His, FLAG, c-myc,
and hemagglutinin (HA). GST, MBP, Trx, CBP, and 6-His enable purification of their cognate fusion
proteins on immobilized glutathione, maltose, phenylarsine oxide, calmodulin, and metal-chelate
resins, respectively. FLAG, c-myc, and hemagglutinin (HA) enable immunoaffinity purification of
fusion proteins using commercially available monoclonal and polyclonal antibodies that specifically
recognize these epitope tags. A fusion protein may also be engineered to contain a proteolytic
cleavage site located between the NAAP encoding sequence and the heterologous protein sequence,
so that NAAP may be cleaved away from the heterologous moiety following purification. Methods
for fusion protein expression and purification are discussed in Ausubel et al. (supra, ch. 10 and 16).
A variety of commercially available kits may also be used to facilitate expression and purification of
fusion proteins.

In another embodiment, synthesis of radiolabeled NAAP may be achieved in vitro using the
TNT rabbit reticulocyte lysate or wheat germ extract system (Promega). These systems couple
transcription and translation of protein-coding sequences operably associated with the T7, T3, or SP6
promoters. Translation takes place in the presence of a radiolabeled amino acid precursor, for
example, 35S-methionine.

NAAP, fragments of NAAP, or variants of NAAP may be used to screen for compounds that
specifically bind to NAAP. One or more test compounds may be screened for specific binding to
NAAP. In various embodiments, 1, 2, 3, 4, 5, 10, 20, 50, 100, or 200 test compounds can be screened
for specific binding to NAAP. Examples of test compounds can include antibodies, anticalins,
oligonucleotides, proteins (e.g., ligands or receptors), or small molecules.

In related embodiments, variants of NAAP can be used to screen for binding of test
compounds, such as antibodies, to NAAP, a variant of NAAP, or a combination of NAAP and/or one
or more variants of NAAP. In an embodiment, a variant of NAAP can be used to screen for compounds
that bind to a variant of NAAP, but not to NAAP having the exact sequence of a sequence of SEQ ID
NO:1-35. NAAP variants used to perform such screening can have a range of about 50% to about
99% sequence identity to NAAP, with various embodiments having 60%, 70%, 75%, 80%, 85%,
90%, and 95% sequence identity.
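
As an illustration of how a percent sequence identity figure of the kind recited above might be computed, the following Python sketch counts matching positions between two sequences. It assumes the sequences have already been aligned to equal length, with any gap characters counted as mismatches, which is a simplification of the alignment-based methods used in practice. The function names and the example peptides are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions at which the two sequences are identical."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * matches / len(seq_a)


def in_variant_range(identity: float, low: float = 50.0, high: float = 99.0) -> bool:
    """Check whether an identity value falls in the about 50% to about 99% range."""
    return low <= identity <= high


# Hypothetical ten-residue peptides differing at two positions.
reference = "MKTAYIAKQR"
variant = "MKTAYLAKQK"
identity = percent_identity(reference, variant)
print(identity, in_variant_range(identity))
```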

In an embodiment, a compound identified in a screen for specific binding to NAAP can be
closely related to the natural ligand of NAAP, e.g., a ligand or fragment thereof, a natural substrate, a
structural or functional mimetic, or a natural binding partner (Coligan, J.E. et al. (1991) Current
Protocols in Immunology 1(2):Chapter 5). In another embodiment, the compound thus identified can
be a natural ligand of a receptor NAAP (Howard, A.D. et al. (2001) Trends Pharmacol. Sci. 22:132-140;
Wise, A. et al. (2002) Drug Discovery Today 7:235-246).

In other embodiments, a compound identified in a screen for specific binding to NAAP can
be closely related to the natural receptor to which NAAP binds, at least a fragment of the receptor, or
a fragment of the receptor including all or a portion of the ligand binding site or binding pocket. For
example, the compound may be a receptor for NAAP which is capable of propagating a signal, or a
decoy receptor for NAAP which is not capable of propagating a signal (Ashkenazi, A. and V.M. Dixit
(1999) Curr. Opin. Cell Biol. 11:255-260; Mantovani, A. et al. (2001) Trends Immunol. 22:328-336).
The compound can be rationally designed using known techniques. Examples of such techniques
include those used to construct the compound etanercept (ENBREL; Immunex Corp., Seattle WA),
which is efficacious for treating rheumatoid arthritis in humans. Etanercept is an engineered p75
tumor necrosis factor (TNF) receptor dimer linked to the Fc portion of human IgG1 (Taylor, P.C. et al.
(2001) Curr. Opin. Immunol. 13:611-616).

In one embodiment, two or more antibodies having similar or, alternatively, different
specificities can be screened for specific binding to NAAP, fragments of NAAP, or variants of
NAAP. The binding specificity of the antibodies thus screened can thereby be selected to identify
particular fragments or variants of NAAP. In one embodiment, an antibody can be selected such that
its binding specificity allows for preferential identification of specific fragments or variants of
NAAP. In another embodiment, an antibody can be selected such that its binding specificity allows
for preferential diagnosis of a specific disease or condition having increased, decreased, or otherwise
abnormal production of NAAP.

In an embodiment, anticalins can be screened for specific binding to NAAP, fragments of
NAAP, or variants of NAAP. Anticalins are ligand-binding proteins that have been constructed based
on a lipocalin scaffold (Weiss, G.A. and H.B. Lowman (2000) Chem. Biol. 7:R177-R184; Skerra, A.
(2001) J. Biotechnol. 74:257-275). The protein architecture of lipocalins can include a beta-barrel
having eight antiparallel beta-strands, which supports four loops at its open end. These loops form
the natural ligand-binding site of the lipocalins, a site which can be re-engineered in vitro by amino
acid substitutions to impart novel binding specificities. The amino acid substitutions can be made
using methods known in the art or described herein, and can include conservative substitutions (e.g.,
substitutions that do not alter binding specificity) or substitutions that modestly, moderately, or
significantly alter binding specificity.

In one embodiment, screening for compounds which specifically bind to, stimulate, or inhibit
NAAP involves producing appropriate cells which express NAAP, either as a secreted protein or on
the cell membrane. Preferred cells include cells from mammals, yeast, Drosophila, or E. coli. Cells
expressing NAAP or cell membrane fractions which contain NAAP are then contacted with a test
compound and binding, stimulation, or inhibition of activity of either NAAP or the compound is
analyzed.

An assay may simply test binding of a test compound to the polypeptide, wherein binding is
detected by a fluorophore, radioisotope, enzyme conjugate, or other detectable label. For example,
the assay may comprise the steps of combining at least one test compound with NAAP, either in
solution or affixed to a solid support, and detecting the binding of NAAP to the compound.
Alternatively, the assay may detect or measure binding of a test compound in the presence of a
labeled competitor. Additionally, the assay may be carried out using cell-free preparations, chemical
libraries, or natural product mixtures, and the test compound(s) may be free in solution or affixed to a
solid support.

An assay can be used to assess the ability of a compound to bind to its natural ligand and/or
to inhibit the binding of its natural ligand to its natural receptors. Examples of such assays include
radio-labeling assays such as those described in U.S. Patent No. 5,914,236 and U.S. Patent No.
6,372,724. In a related embodiment, one or more amino acid substitutions can be introduced into a
polypeptide compound (such as a receptor) to improve or alter its ability to bind to its natural ligands
(Matthews, D.J. and J.A. Wells (1994) Chem. Biol. 1:25-30). In another related embodiment, one or
more amino acid substitutions can be introduced into a polypeptide compound (such as a ligand) to
improve or alter its ability to bind to its natural receptors (Cunningham, B.C. and J.A. Wells (1991)
Proc. Natl. Acad. Sci. USA 88:3407-3411; Lowman, H.B. et al. (1991) J. Biol. Chem. 266:10982-10988).

NAAP, fragments of NAAP, or variants of NAAP may be used to screen for compounds that
modulate the activity of NAAP. Such compounds may include agonists, antagonists, or partial or
inverse agonists. In one embodiment, an assay is performed under conditions permissive for NAAP
activity, wherein NAAP is combined with at least one test compound, and the activity of NAAP in the
presence of a test compound is compared with the activity of NAAP in the absence of the test
compound. A change in the activity of NAAP in the presence of the test compound is indicative of a
compound that modulates the activity of NAAP. Alternatively, a test compound is combined with an
in vitro or cell-free system comprising NAAP under conditions suitable for NAAP activity, and the
assay is performed. In either of these assays, a test compound which modulates the activity of NAAP
may do so indirectly and need not come in direct contact with NAAP. At least one and up
to a plurality of test compounds may be screened.
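
The comparison of activity in the presence and absence of a test compound may be expressed numerically as a ratio. The following Python sketch is illustrative only; the function names, the example activity values, and in particular the 20% tolerance used to call a change are hypothetical assumptions and are not specified by the assay described above.

```python
def fold_change(activity_with: float, activity_without: float) -> float:
    """Ratio of activity with the test compound to activity without it."""
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    return activity_with / activity_without


def classify_modulation(activity_with: float,
                        activity_without: float,
                        tolerance: float = 0.2) -> str:
    """Label a compound by the direction of the activity change.

    The tolerance is a hypothetical cutoff; a real assay would set it
    from replicate measurements and appropriate controls.
    """
    ratio = fold_change(activity_with, activity_without)
    if ratio > 1 + tolerance:
        return "increases activity"
    if ratio < 1 - tolerance:
        return "decreases activity"
    return "no detectable change"


# Hypothetical readings in arbitrary units.
print(classify_modulation(150.0, 100.0))
print(classify_modulation(50.0, 100.0))
print(classify_modulation(105.0, 100.0))
```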

In another embodiment, polynucleotides encoding NAAP or their mammalian homologs may
be "knocked out" in an animal model system using homologous recombination in embryonic stem
(ES) cells. Such techniques are well known in the art and are useful for the generation of animal
models of human disease (see, e.g., U.S. Patent No. 5,175,383 and U.S. Patent No. 5,767,337). For
example, mouse ES cells, such as the mouse 129/SvJ cell line, are derived from the early mouse
embryo and grown in culture. The ES cells are transformed with a vector containing the gene of
interest disrupted by a marker gene, e.g., the neomycin phosphotransferase gene (neo; Capecchi, M.R.
(1989) Science 244:1288-1292). The vector integrates into the corresponding region of the host
genome by homologous recombination. Alternatively, homologous recombination takes place using
the Cre-loxP system to knockout a gene of interest in a tissue- or developmental stage-specific
manner (Marth, J.D. (1996) Clin. Invest. 97:1999-2002; Wagner, K.U. et al. (1997) Nucleic Acids
Res. 25:4323-4330). Transformed ES cells are identified and microinjected into mouse cell
blastocysts such as those from the C57BL/6 mouse strain. The blastocysts are surgically transferred
to pseudopregnant dams, and the resulting chimeric progeny are genotyped and bred to produce
heterozygous or homozygous strains. Transgenic animals thus generated may be tested with potential
therapeutic or toxic agents.

Polynucleotides encoding NAAP may also be manipulated in vitro in ES cells derived from
human blastocysts. Human ES cells have the potential to differentiate into at least eight separate cell
lineages including endoderm, mesoderm, and ectodermal cell types. These cell lineages differentiate
into, for example, neural cells, hematopoietic lineages, and cardiomyocytes (Thomson, J.A. et al.
(1998) Science 282:1145-1147).

Polynucleotides encoding NAAP can also be used to create "knockin" humanized animals
(pigs) or transgenic animals (mice or rats) to model human disease. With knockin technology, a
region of a polynucleotide encoding NAAP is injected into animal ES cells, and the injected sequence
integrates into the animal cell genome. Transformed cells are injected into blastulae, and the
blastulae are implanted as described above. Transgenic progeny or inbred lines are studied and
treated with potential pharmaceutical agents to obtain information on treatment of a human disease.
Alternatively, a mammal inbred to overexpress NAAP, e.g., by secreting NAAP in its milk, may also
serve as a convenient source of that protein (Janne, J. et al. (1998) Biotechnol. Annu. Rev. 4:55-74).

THERAPEUTICS

Chemical and structural similarity, e.g., in the context of sequences and motifs, exists
between regions of NAAP and nucleic acid-associated proteins. In addition, examples of tissues
expressing NAAP can be found in Table 6 and can also be found in Example XI. Therefore, NAAP
appears to play a role in cell proliferative, neurological, developmental, and
autoimmune/inflammatory disorders, and infections. In the treatment of disorders associated with
increased NAAP expression or activity, it is desirable to decrease the expression or activity of NAAP.
In the treatment of disorders associated with decreased NAAP expression or activity, it is desirable to
increase the expression or activity of NAAP.

Therefore, in one embodiment, NAAP or a fragment or derivative thereof may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of NAAP. Examples of such disorders include, but are not limited to, a cell proliferative
disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed
connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria,
polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma,
leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, a cancer of
the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia,
gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate,
salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; a neurological disorder such as
epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's
disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders,
amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy,
retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial
and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial
thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases
including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal
familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis,
tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental
retardation and other developmental disorder of the central nervous system, cerebral palsy, a
neuroskeletal disorder, an autonomic nervous system disorder, a cranial nerve disorder, a spinal cord
disease, muscular dystrophy and other neuromuscular disorder, a peripheral nervous system disorder,
dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathy, myasthenia
gravis, periodic paralysis, a mental disorder including mood, anxiety, and schizophrenic disorder,
seasonal affective disorder (SAD), akathisia, amnesia, catatonia, diabetic neuropathy, tardive
dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, progressive supranuclear palsy,
corticobasal degeneration, and familial frontotemporal dementia, and Tourette's disorder; a
developmental disorder such as renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic
dwarfism, Duchenne and Becker muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR
syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental retardation),
Smith-Magenis syndrome, myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary
keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis,
hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy,
spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, and sensorineural
hearing loss; an autoimmune/inflammatory disorder such as acquired immunodeficiency syndrome
(AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis,
amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis,
autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis,
cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes
mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis,
erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves'
disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis,
myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis,
polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome,
systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura,
ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and
extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and
trauma; an infection, such as those caused by a viral agent classified as adenovirus, arenavirus,
bunyavirus, calicivirus, coronavirus, filovirus, hepadnavirus, herpesvirus, flavivirus, orthomyxovirus,
parvovirus, papovavirus, paramyxovirus, picornavirus, poxvirus, reovirus, retrovirus, rhabdovirus, or
togavirus; an infection caused by a bacterial agent classified as pneumococcus, staphylococcus,
streptococcus, bacillus, corynebacterium, clostridium, meningococcus, gonococcus, listeria,
moraxella, kingella, haemophilus, legionella, bordetella, gram-negative enterobacterium including
shigella, salmonella, or campylobacter, pseudomonas, vibrio, brucella, francisella, yersinia,
bartonella, norcardium, actinomyces, mycobacterium, spirochaetale, rickettsia, chlamydia, or
mycoplasma; an infection caused by a fungal agent classified as aspergillus, blastomyces,
dermatophytes, cryptococcus, coccidioides, malassezia, histoplasma, or other mycosis-causing fungal
agent; and an infection caused by a parasite classified as plasmodium or malaria-causing, parasitic
entamoeba, leishmania, trypanosoma, toxoplasma, Pneumocystis carinii, intestinal protozoa such as
giardia, trichomonas, tissue nematode such as trichinella, intestinal nematode such as ascaris,
lymphatic filarial nematode, trematode such as schistosoma, and cestode such as tapeworm.

In another embodiment, a vector capable of expressing NAAP or a fragment or derivative
thereof may be administered to a subject to treat or prevent a disorder associated with decreased 
expression or activity of NAAP including, but not limited to, those described above. 

In a further embodiment, a composition comprising a substantially purified NAAP in
conjunction with a suitable pharmaceutical carrier may be administered to a subject to treat or prevent
a disorder associated with decreased expression or activity of NAAP including, but not limited to,
those provided above.

In still another embodiment, an agonist which modulates the activity of NAAP may be
administered to a subject to treat or prevent a disorder associated with decreased expression or
activity of NAAP including, but not limited to, those listed above.

In a further embodiment, an antagonist of NAAP may be administered to a subject to treat or
prevent a disorder associated with increased expression or activity of NAAP. Examples of such
disorders include, but are not limited to, those cell proliferative, neurological, developmental, and
autoimmune/inflammatory disorders, and infections described above. In one aspect, an antibody
which specifically binds NAAP may be used directly as an antagonist or indirectly as a targeting or
delivery mechanism for bringing a pharmaceutical agent to cells or tissues which express NAAP.

In an additional embodiment, a vector expressing the complement of the polynucleotide
encoding NAAP may be administered to a subject to treat or prevent a disorder associated with
increased expression or activity of NAAP including, but not limited to, those described above.

In other embodiments, any of the protein, agonist, antagonist, antibody, complementary sequence,
or vector embodiments may be administered in combination with other appropriate therapeutic
agents. Selection of the appropriate agents for use in combination therapy may be made by one of
ordinary skill in the art, according to conventional pharmaceutical principles. The combination of
therapeutic agents may act synergistically to effect the treatment or prevention of the various
disorders described above. Using this approach, one may be able to achieve therapeutic efficacy with
lower dosages of each agent, thus reducing the potential for adverse side effects.

An antagonist of NAAP may be produced using methods which are generally known in the
art. In particular, purified NAAP may be used to produce antibodies or to screen libraries of
pharmaceutical agents to identify those which specifically bind NAAP. Antibodies to NAAP may
also be generated using methods that are well known in the art. Such antibodies may include, but are
not limited to, polyclonal, monoclonal, chimeric, and single chain antibodies, Fab fragments, and
fragments produced by a Fab expression library. Neutralizing antibodies (i.e., those which inhibit
dimer formation) are generally preferred for therapeutic use. Single chain antibodies (e.g., from
camels or llamas) may be potent enzyme inhibitors and may have advantages in the design of peptide
mimetics, and in the development of immuno-adsorbents and biosensors (Muyldermans, S. (2001) J.
Biotechnol. 74:277-302).

For the production of antibodies, various hosts including goats, rabbits, rats, mice, camels,
dromedaries, llamas, humans, and others may be immunized by injection with NAAP or with any
fragment or oligopeptide thereof which has immunogenic properties. Depending on the host species,
various adjuvants may be used to increase immunological response. Such adjuvants include, but are
not limited to, Freund's, mineral gels such as aluminum hydroxide, and surface active substances such
as lysolecithin, pluronic polyols, polyanions, peptides, oil emulsions, KLH, and dinitrophenol.
Among adjuvants used in humans, BCG (bacilli Calmette-Guerin) and Corynebacterium parvum are
especially preferable.

It is preferred that the oligopeptides, peptides, or fragments used to induce antibodies to
NAAP have an amino acid sequence consisting of at least about 5 amino acids, and generally will
consist of at least about 10 amino acids. It is also preferable that these oligopeptides, peptides, or
fragments are identical to a portion of the amino acid sequence of the natural protein. Short stretches
of NAAP amino acids may be fused with those of another protein, such as KLH, and antibodies to the
chimeric molecule may be produced.

Monoclonal antibodies to NAAP may be prepared using any technique which provides for the
production of antibody molecules by continuous cell lines in culture. These include, but are not
limited to, the hybridoma technique, the human B-cell hybridoma technique, and the EBV-hybridoma
technique (Kohler, G. et al. (1975) Nature 256:495-497; Kozbor, D. et al. (1985) J. Immunol.
Methods 81:31-42; Cote, R.J. et al. (1983) Proc. Natl. Acad. Sci. USA 80:2026-2030; Cole, S.P. et al.
(1984) Mol. Cell Biol. 62:109-120).

In addition, techniques developed for the production of "chimeric antibodies," such as the
splicing of mouse antibody genes to human antibody genes to obtain a molecule with appropriate
antigen specificity and biological activity, can be used (Morrison, S.L. et al. (1984) Proc. Natl. Acad.
Sci. USA 81:6851-6855; Neuberger, M.S. et al. (1984) Nature 312:604-608; Takeda, S. et al. (1985)
Nature 314:452-454). Alternatively, techniques described for the production of single chain
antibodies may be adapted, using methods known in the art, to produce NAAP-specific single chain
antibodies. Antibodies with related specificity, but of distinct idiotypic composition, may be
generated by chain shuffling from random combinatorial immunoglobulin libraries (Burton, D.R.
(1991) Proc. Natl. Acad. Sci. USA 88:10134-10137).

Antibodies may also be produced by inducing in vivo production in the lymphocyte
population or by screening immunoglobulin libraries or panels of highly specific binding reagents as
disclosed in the literature (Orlandi, R. et al. (1989) Proc. Natl. Acad. Sci. USA 86:3833-3837; Winter,
G. et al. (1991) Nature 349:293-299).

Antibody fragments which contain specific binding sites for NAAP may also be generated.
For example, such fragments include, but are not limited to, F(ab')2 fragments produced by pepsin
digestion of the antibody molecule and Fab fragments generated by reducing the disulfide bridges of
the F(ab')2 fragments. Alternatively, Fab expression libraries may be constructed to allow rapid and
easy identification of monoclonal Fab fragments with the desired specificity (Huse, W.D. et al. (1989)
Science 246:1275-1281).

Various immunoassays may be used for screening to identify antibodies having the desired
specificity. Numerous protocols for competitive binding or immunoradiometric assays using either
polyclonal or monoclonal antibodies with established specificities are well known in the art. Such
immunoassays typically involve the measurement of complex formation between NAAP and its
specific antibody. A two-site, monoclonal-based immunoassay utilizing monoclonal antibodies
reactive to two non-interfering NAAP epitopes is generally used, but a competitive binding assay may
also be employed (Pound, supra).

Various methods such as Scatchard analysis in conjunction with radioimmunoassay
techniques may be used to assess the affinity of antibodies for NAAP. Affinity is expressed as an
association constant, Ka, which is defined as the molar concentration of NAAP-antibody complex
divided by the product of the molar concentrations of free antigen and free antibody under
equilibrium conditions.
The Ka determined for a preparation of polyclonal antibodies, which are heterogeneous in their 
affinities for multiple NAAP epitopes, represents the average affinity, or avidity, of the antibodies for
NAAP. The Ka determined for a preparation of monoclonal antibodies, which are monospecific for a
particular NAAP epitope, represents a true measure of affinity. High-affinity antibody preparations 
with Ka ranging from about 10^9 to 10^12 L/mole are preferred for use in immunoassays in which the
NAAP-antibody complex must withstand rigorous manipulations. Low-affinity antibody preparations 
with Ka ranging from about 10^6 to 10^7 L/mole are preferred for use in immunopurification and similar
procedures which ultimately require dissociation of NAAP, preferably in active form, from the
antibody (Catty, D. (1988) Antibodies, Volume I: A Practical Approach, IRL Press, Washington DC;
Liddell, J.E. and A. Cryer (1991) A Practical Guide to Monoclonal Antibodies, John Wiley & Sons,
New York NY).
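
By way of illustration only, the association constant defined above and its estimation from a
Scatchard plot may be sketched as follows. The function names, the single-class-of-sites assumption,
and the use of an ordinary least-squares fit are assumptions made for this sketch and are not taken
from the cited references. For one class of binding sites, a plot of bound/free against bound is
linear with slope -Ka.

```python
def association_constant(complex_molar, free_antigen_molar, free_antibody_molar):
    """Ka = [complex] / ([free antigen] * [free antibody]), in L/mole."""
    if free_antigen_molar <= 0 or free_antibody_molar <= 0:
        raise ValueError("free concentrations must be positive")
    return complex_molar / (free_antigen_molar * free_antibody_molar)


def scatchard_ka(bound, free):
    """Estimate Ka (L/mole) from paired bound and free antigen concentrations.

    Hypothetical helper: fits bound/free (y) against bound (x) by least
    squares and returns the negated slope, assuming one class of sites.
    """
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("bound concentrations must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -(sxy / sxx)
```

A polyclonal preparation generally gives a curved Scatchard plot, so a single fitted slope of this
kind would reflect only an average affinity.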

The titer and avidity of polyclonal antibody preparations may be further evaluated to
determine the quality and suitability of such preparations for certain downstream applications. For
example, a polyclonal antibody preparation containing at least 1-2 mg specific antibody/ml,
preferably 5-10 mg specific antibody/ml, is generally employed in procedures requiring precipitation
of NAAP-antibody complexes. Procedures for evaluating antibody specificity, titer, and avidity, and
guidelines for antibody quality and usage in various applications, are generally available (Catty,
supra; Coligan et al., supra).

In another embodiment of the invention, polynucleotides encoding NAAP, or any fragment or
complement thereof, may be used for therapeutic purposes. In one aspect, modifications of gene
expression can be achieved by designing complementary sequences or antisense molecules (DNA,
RNA, PNA, or modified oligonucleotides) to the coding or regulatory regions of the gene encoding
NAAP. Such technology is well known in the art, and antisense oligonucleotides or larger fragments
can be designed from various locations along the coding or control regions of sequences encoding
NAAP (Agrawal, S., ed. (1996) Antisense Therapeutics, Humana Press, Totowa NJ).
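
As a minimal sketch of the complementary-sequence design described above, an antisense DNA
oligonucleotide may be derived as the reverse complement of a chosen window of the sense strand.
The function name, the 0-based coordinates, and the example sequence are hypothetical and are
used for illustration only; no NAAP sequence is implied.

```python
# Watson-Crick complements for a DNA antisense oligonucleotide.
COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def antisense_dna(sense, start, length):
    """Return the antisense DNA oligo (written 5'->3') for a sense-strand window.

    `sense` may be DNA or mRNA (U is read as T); `start` is a 0-based
    offset into the sense sequence and `length` the oligo length.
    """
    seq = sense.upper().replace("U", "T")
    if start < 0 or length <= 0 or start + length > len(seq):
        raise ValueError("window lies outside the sense sequence")
    window = seq[start:start + length]
    # Reverse the window, then complement each base.
    return "".join(COMPLEMENT[base] for base in reversed(window))


# Illustrative input only: a 6-nucleotide window of a made-up sense strand.
example_oligo = antisense_dna("ATGGCC", 0, 6)
```

In practice, candidate oligonucleotides would still need to be screened for target accessibility
and specificity, which this sketch does not attempt.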

In therapeutic use, any gene delivery system suitable for introduction of the antisense
sequences into appropriate target cells can be used. Antisense sequences can be delivered
intracellularly in the form of an expression plasmid which, upon transcription, produces a sequence
complementary to at least a portion of the cellular sequence encoding the target protein (Slater, J.E. et
al. (1998) J. Allergy Clin. Immunol. 102:469-475; Scanlon, K.J. et al. (1995) 9:1288-1296).
Antisense sequences can also be introduced intracellularly through the use of viral vectors, such as
retrovirus and adeno-associated virus vectors (Miller, A.D. (1990) Blood 76:271; Ausubel et al.,
supra; Uckert, W. and W. Walther (1994) Pharmacol. Ther. 63:323-347). Other gene delivery
mechanisms include liposome-derived systems, artificial viral envelopes, and other systems known in
the art (Rossi, J.J. (1995) Br. Med. Bull. 51:217-225; Boado, R.J. et al. (1998) J. Pharm. Sci.
87:1308-1315; Morris, M.C. et al. (1997) Nucleic Acids Res. 25:2730-2736).

In another embodiment of the invention, polynucleotides encoding NAAP may be used for
somatic or germline gene therapy. Gene therapy may be performed to (i) correct a genetic deficiency
(e.g., in the cases of severe combined immunodeficiency (SCID)-X1 disease characterized by
X-linked inheritance (Cavazzana-Calvo, M. et al. (2000) Science 288:669-672), severe combined
immunodeficiency syndrome associated with an inherited adenosine deaminase (ADA) deficiency
(Blaese, R.M. et al. (1995) Science 270:475-480; Bordignon, C. et al. (1995) Science 270:470-475),
cystic fibrosis (Zabner, J. et al. (1993) Cell 75:207-216; Crystal, R.G. et al. (1995) Hum. Gene
Therapy 6:643-666; Crystal, R.G. et al. (1995) Hum. Gene Therapy 6:667-703), thalassemias, familial
hypercholesterolemia, and hemophilia resulting from Factor VIII or Factor IX deficiencies (Crystal,
R.G. (1995) Science 270:404-410; Verma, I.M. and N. Somia (1997) Nature 389:239-242)), (ii)
express a conditionally lethal gene product (e.g., in the case of cancers which result from unregulated
cell proliferation), or (iii) express a protein which affords protection against intracellular parasites
(e.g., against human retroviruses, such as human immunodeficiency virus (HIV) (Baltimore, D.
(1988) Nature 335:395-396; Poeschla, E. et al. (1996) Proc. Natl. Acad. Sci. USA 93:11395-11399),
hepatitis B or C virus (HBV, HCV); fungal parasites, such as Candida albicans and Paracoccidioides
brasiliensis; and protozoan parasites such as Plasmodium falciparum and Trypanosoma cruzi). In the
case where a genetic deficiency in NAAP expression or regulation causes disease, the expression of
NAAP from an appropriate population of transduced cells may alleviate the clinical manifestations
caused by the genetic deficiency.

In a further embodiment of the invention, diseases or disorders caused by deficiencies in
NAAP are treated by constructing mammalian expression vectors encoding NAAP and introducing
these vectors by mechanical means into NAAP-deficient cells. Mechanical transfer technologies for
use with cells in vivo or ex vitro include (i) direct DNA microinjection into individual cells, (ii)
ballistic gold particle delivery, (iii) liposome-mediated transfection, (iv) receptor-mediated gene
transfer, and (v) the use of DNA transposons (Morgan, R.A. and W.F. Anderson (1993) Annu. Rev.
Biochem. 62:191-217; Ivics, Z. (1997) Cell 91:501-510; Boulay, J.-L. and H. Recipon (1998) Curr.
Opin. Biotechnol. 9:445-450).

Expression vectors that may be effective for the expression of NAAP include, but are not
limited to, the PCDNA 3.1, EPITAG, PRCCMV2, PREP, PVAX, PCR2-TOPOTA vectors
(Invitrogen, Carlsbad CA), PCMV-SCRIPT, PCMV-TAG, PEGSH/PERV (Stratagene, La Jolla CA),
and PTET-OFF, PTET-ON, PTRE2, PTRE2-LUC, PTK-HYG (Clontech, Palo Alto CA). NAAP may
be expressed using (i) a constitutively active promoter (e.g., from cytomegalovirus (CMV), Rous
sarcoma virus (RSV), SV40 virus, thymidine kinase (TK), or β-actin genes), (ii) an inducible
promoter (e.g., the tetracycline-regulated promoter (Gossen, M. and H. Bujard (1992) Proc. Natl.
Acad. Sci. USA 89:5547-5551; Gossen, M. et al. (1995) Science 268:1766-1769; Rossi, F.M.V. and
H.M. Blau (1998) Curr. Opin. Biotechnol. 9:451-456), commercially available in the T-REX plasmid
(Invitrogen)); the ecdysone-inducible promoter (available in the plasmids PVGRXR and PIND;
Invitrogen); the FK506/rapamycin inducible promoter; or the RU486/mifepristone inducible promoter
(Rossi, F.M.V. and H.M. Blau, supra)), or (iii) a tissue-specific promoter or the native promoter of
the endogenous gene encoding NAAP from a normal individual.

Commercially available liposome transformation kits (e.g., the PERFECT LIPID
TRANSFECTION KIT, available from Invitrogen) allow one with ordinary skill in the art to deliver
polynucleotides to target cells in culture and require minimal effort to optimize experimental
parameters. In the alternative, transformation is performed using the calcium phosphate method
(Graham, F.L. and A.J. Eb (1973) Virology 52:456-467), or by electroporation (Neumann, E. et al.
(1982) EMBO J. 1:841-845). The introduction of DNA to primary cells requires modification of
these standardized mammalian transfection protocols.

In another embodiment of the invention, diseases or disorders caused by genetic defects with
respect to NAAP expression are treated by constructing a retrovirus vector consisting of (i) the
polynucleotide encoding NAAP under the control of an independent promoter or the retrovirus long
terminal repeat (LTR) promoter, (ii) appropriate RNA packaging signals, and (iii) a Rev-responsive
element (RRE) along with additional retrovirus cis-acting RNA sequences and coding sequences
required for efficient vector propagation. Retrovirus vectors (e.g., PFB and PFBNEO) are
commercially available (Stratagene) and are based on published data (Riviere, I. et al. (1995) Proc.
Natl. Acad. Sci. USA 92:6733-6737), incorporated by reference herein. The vector is propagated in
an appropriate vector producing cell line (VPCL) that expresses an envelope gene with a tropism for
receptors on the target cells or a promiscuous envelope protein such as VSVg (Armentano, D. et al.
(1987) J. Virol. 61:1647-1650; Bender, M.A. et al. (1987) J. Virol. 61:1639-1646; Adam, M.A. and
A.D. Miller (1988) J. Virol. 62:3802-3806; Dull, T. et al. (1998) J. Virol. 72:8463-8471; Zufferey, R.
et al. (1998) J. Virol. 72:9873-9880). U.S. Patent No. 5,910,434 to Rigg ("Method for obtaining
retrovirus packaging cell lines producing high transducing efficiency retroviral supernatant")
discloses a method for obtaining retrovirus packaging cell lines and is hereby incorporated by
reference. Propagation of retrovirus vectors, transduction of a population of cells (e.g., T-cells),
and the return of transduced cells to a patient are procedures well known to persons skilled in
the art of gene therapy and have been well documented (Ranga, U. et al. (1997) J. Virol.
71:7020-7029; Bauer, G. et al. (1997) Blood 89:2259-2267; Bonyhadi, M.L. (1997) J. Virol. 71:4707-4716;
Ranga, U. et al. (1998) Proc. Natl. Acad. Sci. USA 95:1201-1206; Su, L. (1997) Blood
89:2283-2290).

In an embodiment, an adenovirus-based gene therapy delivery system is used to deliver
polynucleotides encoding NAAP to cells which have one or more genetic abnormalities with respect
to the expression of NAAP. The construction and packaging of adenovirus-based vectors are well
known to those with ordinary skill in the art. Replication defective adenovirus vectors have proven to
be versatile for importing genes encoding immunoregulatory proteins into intact islets in the pancreas
(Csete, M.E. et al. (1995) Transplantation 27:263-268). Potentially useful adenoviral vectors are
described in U.S. Patent No. 5,707,618 to Armentano ("Adenovirus vectors for gene therapy"),
hereby incorporated by reference. For adenoviral vectors, see also Antinozzi, P.A. et al. (1999; Annu.
Rev. Nutr. 19:511-544) and Verma, I.M. and N. Somia (1997; Nature 18:389:239-242).

In another embodiment, a herpes-based gene therapy delivery system is used to deliver
polynucleotides encoding NAAP to target cells which have one or more genetic abnormalities with
respect to the expression of NAAP. The use of herpes simplex virus (HSV)-based vectors may be
especially valuable for introducing NAAP to cells of the central nervous system, for which HSV has a
tropism. The construction and packaging of herpes-based vectors are well known to those with
ordinary skill in the art. A replication-competent herpes simplex virus (HSV) type 1-based vector has
been used to deliver a reporter gene to the eyes of primates (Liu, X. et al. (1999) Exp. Eye Res.
169:385-395). The construction of a HSV-1 virus vector has also been disclosed in detail in U.S.
Patent No. 5,804,413 to DeLuca ("Herpes simplex virus strains for gene transfer"), which is hereby
incorporated by reference. U.S. Patent No. 5,804,413 teaches the use of recombinant HSV d92 which
consists of a genome containing at least one exogenous gene to be transferred to a cell under the
control of the appropriate promoter for purposes including human gene therapy. Also taught by this
patent are the construction and use of recombinant HSV strains deleted for ICP4, ICP27 and ICP22.
For HSV vectors, see also Goins, W.F. et al. (1999; J. Virol. 73:519-532) and Xu, H. et al. (1994;
Dev. Biol. 163:152-161). The manipulation of cloned herpesvirus sequences, the generation of
recombinant virus following the transfection of multiple plasmids containing different segments of
the large herpesvirus genomes, the growth and propagation of herpesvirus, and the infection of cells
with herpesvirus are techniques well known to those of ordinary skill in the art.

In another embodiment, an alphavirus (positive, single-stranded RNA virus) vector is used to
deliver polynucleotides encoding NAAP to target cells. The biology of the prototypic alphavirus,
Semliki Forest Virus (SFV), has been studied extensively and gene transfer vectors have been based
on the SFV genome (Garoff, H. and K.-J. Li (1998) Curr. Opin. Biotechnol. 9:464-469). During
alphavirus RNA replication, a subgenomic RNA is generated that normally encodes the viral capsid
proteins. This subgenomic RNA replicates to higher levels than the full length genomic RNA,
resulting in the overproduction of capsid proteins relative to the viral proteins with enzymatic activity
(e.g., protease and polymerase). Similarly, inserting the coding sequence for NAAP into the
alphavirus genome in place of the capsid-coding region results in the production of a large number of
NAAP-coding RNAs and the synthesis of high levels of NAAP in vector transduced cells. While
alphavirus infection is typically associated with cell lysis within a few days, the ability to establish a
persistent infection in hamster normal kidney cells (BHK-21) with a variant of Sindbis virus (SIN)
indicates that the lytic replication of alphaviruses can be altered to suit the needs of the gene therapy
application (Dryga, S.A. et al. (1997) Virology 228:74-83). The wide host range of alphaviruses will
allow the introduction of NAAP into a variety of cell types. The specific transduction of a subset of
cells in a population may require the sorting of cells prior to transduction. The methods of
manipulating infectious cDNA clones of alphaviruses, performing alphavirus cDNA and RNA
transfections, and performing alphavirus infections, are well known to those with ordinary skill in the
art.

Oligonucleotides derived from the transcription initiation site, e.g., between about positions
-10 and +10 from the start site, may also be employed to inhibit gene expression. Similarly,
inhibition can be achieved using triple helix base-pairing methodology. Triple helix pairing is useful
because it causes inhibition of the ability of the double helix to open sufficiently for the binding of
polymerases, transcription factors, or regulatory molecules. Recent therapeutic advances using
triplex DNA have been described in the literature (Gee, J.E. et al. (1994) in Huber, B.E. and B.I. Carr,
Molecular and Immunologic Approaches, Futura Publishing, Mt. Kisco NY, pp. 163-177). A
complementary sequence or antisense molecule may also be designed to block translation of mRNA
by preventing the transcript from binding to ribosomes.
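
Purely as an illustration of the -10 to +10 window mentioned above, the corresponding stretch of a
template sequence may be extracted as shown below. The function name and its parameters are
hypothetical, and the sketch assumes the common numbering convention in which the start site is
position +1 and there is no position 0, so that the window spans 20 nucleotides.

```python
def start_site_window(seq, start_index, upstream=10, downstream=10):
    """Return the bases from -upstream to +downstream around the start site.

    `start_index` is the 0-based index of the +1 base in `seq`. With no
    position 0, the window covers indices start_index - upstream through
    start_index + downstream - 1.
    """
    lo = start_index - upstream
    hi = start_index + downstream
    if lo < 0 or hi > len(seq):
        raise ValueError("window extends beyond the supplied sequence")
    return seq[lo:hi]
```

The resulting window could then serve as the target from which a complementary oligonucleotide
is designed.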

Ribozymes, enzymatic RNA molecules, may also be used to catalyze the specific cleavage of
RNA. The mechanism of ribozyme action involves sequence-specific hybridization of the ribozyme
molecule to complementary target RNA, followed by endonucleolytic cleavage. For example,
engineered hammerhead motif ribozyme molecules may specifically and efficiently catalyze
endonucleolytic cleavage of RNA molecules encoding NAAP.

Specific ribozyme cleavage sites within any potential RNA target are initially identified by 
scanning the target molecule for ribozyme cleavage sites, including the following sequences: GUA, 
GUU, and GUC. Once identified, short RNA sequences of between 15 and 20 ribonucleotides, 

20 corresponding to the region of the target gene containing the cleavage site, may be evaluated for 
secondary structural features which may render the oligonucleotide inoperable. The suitability of 
candidate targets may also be evaluated by testing accessibility to hybridization with complementary 
oligonucleotides using ribonuclease protection assays. 
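
The scanning step described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the function name `find_cleavage_windows`, the choice to center the window on the triplet, and the default window of 17 ribonucleotides are hypothetical. The sketch only locates GUA, GUU, and GUC sites and extracts the surrounding sequence; it does not evaluate secondary structure or accessibility.

```python
CLEAVAGE_TRIPLETS = ("GUA", "GUU", "GUC")  # motifs named in the text


def find_cleavage_windows(rna, window=17):
    """Return (index, triplet, window_sequence) for each candidate site.

    `window` is the length of the surrounding sequence to evaluate; the text
    suggests 15 to 20 ribonucleotides. Centering the window on the triplet is
    an illustrative choice. Targets shorter than `window` yield shorter
    sequences.
    """
    if not 15 <= window <= 20:
        raise ValueError("window should be between 15 and 20 nucleotides")
    rna = rna.upper().replace("T", "U")  # accept DNA-style input
    flank = (window - 3) // 2
    hits = []
    for i in range(len(rna) - 2):
        triplet = rna[i:i + 3]
        if triplet in CLEAVAGE_TRIPLETS:
            start = max(0, i - flank)
            end = min(len(rna), start + window)
            start = max(0, end - window)  # shift left near the 3' end
            hits.append((i, triplet, rna[start:end]))
    return hits
```

For instance, `find_cleavage_windows("AAAAAAAAAAGUCAAAAAAAAAA")` reports a single GUC site at index 10 together with its 17-nucleotide window. Each returned window would still need to be screened for secondary structure, as described above.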

Complementary ribonucleic acid molecules and ribozymes may be prepared by any method
known in the art for the synthesis of nucleic acid molecules. These include techniques for chemically
synthesizing oligonucleotides such as solid phase phosphoramidite chemical synthesis. Alternatively,
RNA molecules may be generated by in vitro and in vivo transcription of DNA molecules encoding
NAAP. Such DNA sequences may be incorporated into a wide variety of vectors with suitable RNA
polymerase promoters such as T7 or SP6. Alternatively, these cDNA constructs that synthesize
complementary RNA, constitutively or inducibly, can be introduced into cell lines, cells, or tissues.

RNA molecules may be modified to increase intracellular stability and half-life. Possible 
modifications include, but are not limited to, the addition of flanking sequences at the 5' and/or 3' 
ends of the molecule, or the use of phosphorothioate or 2' O-methyl rather than phosphodiester
linkages within the backbone of the molecule. This concept is inherent in the production of PNAs
and can be extended in all of these molecules by the inclusion of nontraditional bases such as inosine,
queuosine, and wybutosine, as well as acetyl-, methyl-, thio-, and similarly modified forms of adenine,
cytidine, guanine, thymine, and uridine which are not as easily recognized by endogenous 
endonucleases. 

An additional embodiment of the invention encompasses a method for screening for a 
compound which is effective in altering expression of a polynucleotide encoding NAAP. Compounds
which may be effective in altering expression of a specific polynucleotide may include, but are not
limited to, oligonucleotides, antisense oligonucleotides, triple helix-forming oligonucleotides,
transcription factors and other polypeptide transcriptional regulators, and non-macromolecular
chemical entities which are capable of interacting with specific polynucleotide sequences. Effective
compounds may alter polynucleotide expression by acting as either inhibitors or promoters of
polynucleotide expression. Thus, in the treatment of disorders associated with increased NAAP
expression or activity, a compound which specifically inhibits expression of the polynucleotide
encoding NAAP may be therapeutically useful, and in the treatment of disorders associated with
decreased NAAP expression or activity, a compound which specifically promotes expression of the
polynucleotide encoding NAAP may be therapeutically useful.

At least one, and up to a plurality, of test compounds may be screened for effectiveness in 
altering expression of a specific polynucleotide. A test compound may be obtained by any method 
commonly known in the art, including chemical modification of a compound known to be effective in
altering polynucleotide expression; selection from an existing, commercially-available or proprietary
library of naturally-occurring or non-natural chemical compounds; rational design of a compound
based on chemical and/or structural properties of the target polynucleotide; and selection from a
library of chemical compounds created combinatorially or randomly. A sample comprising a
polynucleotide encoding NAAP is exposed to at least one test compound thus obtained. The sample
may comprise, for example, an intact or permeabilized cell, or an in vitro cell-free or reconstituted
biochemical system. Alterations in the expression of a polynucleotide encoding NAAP are assayed
by any method commonly known in the art. Typically, the expression of a specific polynucleotide is
detected by hybridization with a probe having a nucleotide sequence complementary to the sequence
of the polynucleotide encoding NAAP. The amount of hybridization may be quantified, thus forming
the basis for a comparison of the expression of the polynucleotide both with and without exposure to
one or more test compounds. Detection of a change in the expression of a polynucleotide exposed to
a test compound indicates that the test compound is effective in altering the expression of the
polynucleotide. A screen for a compound effective in altering expression of a specific polynucleotide
can be carried out, for example, using a Schizosaccharomyces pombe gene expression system (Atkins,
D. et al. (1999) U.S. Patent No. 5,932,435; Arndt, G.M. et al. (2000) Nucleic Acids Res. 28:E15) or a
human cell line such as HeLa cells (Clarke, M.L. et al. (2000) Biochem. Biophys. Res. Commun.
268:8-13). A particular embodiment of the present invention involves screening a combinatorial
library of oligonucleotides (such as deoxyribonucleotides, ribonucleotides, peptide nucleic acids, and 
modified oligonucleotides) for antisense activity against a specific polynucleotide sequence (Bruice, 
T.W. et al. (1997) U.S. Patent No. 5,686,242; Bruice, T.W. et al. (2000) U.S. Patent No. 6,022,691). 
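
The comparison of quantified hybridization with and without a test compound can be illustrated with a short Python sketch. The function name `classify_compound`, the use of replicate means, and the two-fold threshold are assumptions chosen for illustration; the text does not prescribe any particular threshold or statistic.

```python
from statistics import mean


def classify_compound(treated, untreated, threshold=2.0):
    """Compare hybridization signals with and without a test compound.

    treated, untreated -- replicate signal measurements (positive numbers)
    threshold          -- hypothetical fold change treated as a real effect
    Returns (fold_change, label).
    """
    if not treated or not untreated:
        raise ValueError("both conditions need at least one measurement")
    t, u = mean(treated), mean(untreated)
    if t <= 0 or u <= 0:
        raise ValueError("signals must be positive")
    fold = t / u
    if fold >= threshold:
        return fold, "promoter"
    if fold <= 1 / threshold:
        return fold, "inhibitor"
    return fold, "no effect detected"
```

For example, signals of 10 with the compound and 40 without it give a fold change of 0.25 and the label "inhibitor". A practical screen would normally add replicate statistics and controls that this sketch omits.
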

Many methods for introducing vectors into cells or tissues are available and equally suitable
for use in vivo, in vitro, and ex vivo. For ex vivo therapy, vectors may be introduced into stem cells
taken from the patient and clonally propagated for autologous transplant back into that same patient.
Delivery by transfection, by liposome injections, or by polycationic amino polymers may be achieved
using methods which are well known in the art (Goldman, C.K. et al. (1997) Nat. Biotechnol.
15:462-466).

Any of the therapeutic methods described above may be applied to any subject in need of 
such therapy, including, for example, mammals such as humans, dogs, cats, cows, horses, rabbits, and 
monkeys. 

An additional embodiment of the invention relates to the administration of a composition
which generally comprises an active ingredient formulated with a pharmaceutically acceptable
excipient. Excipients may include, for example, sugars, starches, celluloses, gums, and proteins.
Various formulations are commonly known and are thoroughly discussed in the latest edition of
Remington's Pharmaceutical Sciences (Mack Publishing, Easton PA). Such compositions may
consist of NAAP, antibodies to NAAP, and mimetics, agonists, antagonists, or inhibitors of NAAP. 

The compositions utilized in this invention may be administered by any number of routes
including, but not limited to, oral, intravenous, intramuscular, intra-arterial, intramedullary,
intrathecal, intraventricular, pulmonary, transdermal, subcutaneous, intraperitoneal, intranasal, 
enteral, topical, sublingual, or rectal means. 

Compositions for pulmonary administration may be prepared in liquid or dry powder form.
These compositions are generally aerosolized immediately prior to inhalation by the patient. In the
case of small molecules (e.g. traditional low molecular weight organic drugs), aerosol delivery of 
fast-acting formulations is well-known in the art. In the case of macromolecules (e.g. larger peptides 
and proteins), recent developments in the field of pulmonary delivery via the alveolar region of the 
lung have enabled the practical delivery of drugs such as insulin to blood circulation (see, e.g., Patton,
J.S. et al., U.S. Patent No. 5,997,848). Pulmonary delivery has the advantage of administration
without needle injection, and obviates the need for potentially toxic penetration enhancers. 

Compositions suitable for use in the invention include compositions wherein the active
ingredients are contained in an effective amount to achieve the intended purpose. The determination
of an effective dose is well within the capability of those skilled in the art. 

Specialized forms of compositions may be prepared for direct intracellular delivery of
macromolecules comprising NAAP or fragments thereof. For example, liposome preparations
containing a cell-impermeable macromolecule may promote cell fusion and intracellular delivery of
the macromolecule. Alternatively, NAAP or a fragment thereof may be joined to a short cationic
N-terminal portion from the HIV Tat-1 protein. Fusion proteins thus generated have been found to
transduce into the cells of all tissues, including the brain, in a mouse model system (Schwarze, S.R. et
al. (1999) Science 285:1569-1572). 

For any compound, the therapeutically effective dose can be estimated initially either in cell
culture assays, e.g., of neoplastic cells, or in animal models such as mice, rats, rabbits, dogs,
monkeys, or pigs. An animal model may also be used to determine the appropriate concentration
range and route of administration. Such information can then be used to determine useful doses and
routes for administration in humans. 

A therapeutically effective dose refers to that amount of active ingredient, for example NAAP 
or fragments thereof, antibodies of NAAP, and agonists, antagonists or inhibitors of NAAP, which 
ameliorates the symptoms or condition. Therapeutic efficacy and toxicity may be determined by
standard pharmaceutical procedures in cell cultures or with experimental animals, such as by
calculating the ED50 (the dose therapeutically effective in 50% of the population) or LD50 (the dose
lethal to 50% of the population) statistics. The dose ratio of toxic to therapeutic effects is the
therapeutic index, which can be expressed as the LD50/ED50 ratio. Compositions which exhibit large
therapeutic indices are preferred. The data obtained from cell culture assays and animal studies are
used to formulate a range of dosage for human use. The dosage contained in such compositions is
preferably within a range of circulating concentrations that includes the ED50 with little or no toxicity. 
The dosage varies within this range depending upon the dosage form employed, the sensitivity of the 
patient, and the route of administration. 
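
The therapeutic index defined above is a simple ratio, and the stated preference for large indices amounts to a ranking. The following Python sketch illustrates both; the function names and the example values are hypothetical and are not data from this disclosure.

```python
def therapeutic_index(ld50, ed50):
    """Return the LD50/ED50 ratio; both doses must use the same units."""
    if ld50 <= 0 or ed50 <= 0:
        raise ValueError("LD50 and ED50 must be positive")
    return ld50 / ed50


def rank_by_therapeutic_index(compositions):
    """Order composition names from largest to smallest therapeutic index.

    compositions -- mapping of name to (ld50, ed50); hypothetical input shape
    """
    return sorted(
        compositions,
        key=lambda name: therapeutic_index(*compositions[name]),
        reverse=True,
    )
```

For example, a composition with an LD50 of 500 and an ED50 of 5 (index 100) would rank ahead of one with an LD50 of 100 and an ED50 of 10 (index 10).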

The exact dosage will be determined by the practitioner, in light of factors related to the
subject requiring treatment. Dosage and administration are adjusted to provide sufficient levels of the
active moiety or to maintain the desired effect. Factors which may be taken into account include the
severity of the disease state, the general health of the subject, the age, weight, and gender of the
subject, time and frequency of administration, drug combination(s), reaction sensitivities, and
response to therapy. Long-acting compositions may be administered every 3 to 4 days, every week,
or biweekly depending on the half-life and clearance rate of the particular formulation.

Normal dosage amounts may vary from about 0.1 μg to 100,000 μg, up to a total dose of
about 1 gram, depending upon the route of administration. Guidance as to particular dosages and
methods of delivery is provided in the literature and generally available to practitioners in the art.
Those skilled in the art will employ different formulations for nucleotides than for proteins or their
inhibitors. Similarly, delivery of polynucleotides or polypeptides will be specific to particular cells,
conditions, locations, etc.

DIAGNOSTICS 

In another embodiment, antibodies which specifically bind NAAP may be used for the
diagnosis of disorders characterized by expression of NAAP, or in assays to monitor patients being
treated with NAAP or agonists, antagonists, or inhibitors of NAAP. Antibodies useful for diagnostic
purposes may be prepared in the same manner as described above for therapeutics. Diagnostic assays
for NAAP include methods which utilize the antibody and a label to detect NAAP in human body
fluids or in extracts of cells or tissues. The antibodies may be used with or without modification, and
may be labeled by covalent or non-covalent attachment of a reporter molecule. A wide variety of
reporter molecules, several of which are described above, are known in the art and may be used.

A variety of protocols for measuring NAAP, including ELISAs, RIAs, and FACS, are known
in the art and provide a basis for diagnosing altered or abnormal levels of NAAP expression. Normal
or standard values for NAAP expression are established by combining body fluids or cell extracts
taken from normal mammalian subjects, for example, human subjects, with antibodies to NAAP
under conditions suitable for complex formation. The amount of standard complex formation may be
quantitated by various methods, such as photometric means. Quantities of NAAP expressed in
subject, control, and disease samples from biopsied tissues are compared with the standard values. 
Deviation between standard and subject values establishes the parameters for diagnosing disease. 
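
One simple way to express deviation between standard and subject values is a z-score against the values measured in normal subjects. The Python sketch below is illustrative only: the function name `deviates_from_standard` and the cutoff of two standard deviations are assumptions, and the text does not specify how deviation is to be scored.

```python
from statistics import mean, stdev


def deviates_from_standard(subject_value, normal_values, z_cutoff=2.0):
    """Score a subject measurement against values from normal subjects.

    Returns (z_score, flagged), where flagged is True when the subject value
    lies at least `z_cutoff` sample standard deviations from the normal mean.
    The cutoff is a hypothetical choice, not a value taken from the text.
    """
    if len(normal_values) < 2:
        raise ValueError("need at least two normal values")
    mu = mean(normal_values)
    sigma = stdev(normal_values)
    if sigma == 0:
        raise ValueError("normal values show no variation")
    z = (subject_value - mu) / sigma
    return z, abs(z) >= z_cutoff
```

With normal values of 8, 10, and 12 (mean 10, sample standard deviation 2), a subject value of 16 scores z = 3 and is flagged, while a value of 11 scores z = 0.5 and is not.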

In another embodiment of the invention, polynucleotides encoding NAAP may be used for
diagnostic purposes. The polynucleotides which may be used include oligonucleotides,
complementary RNA and DNA molecules, and PNAs. The polynucleotides may be used to detect
and quantify gene expression in biopsied tissues in which expression of NAAP may be correlated 
with disease. The diagnostic assay may be used to determine absence, presence, and excess 
expression of NAAP, and to monitor regulation of NAAP levels during therapeutic intervention. 

In one aspect, hybridization with PCR probes which are capable of detecting polynucleotides,
including genomic sequences, encoding NAAP or closely related molecules may be used to identify
nucleic acid sequences which encode NAAP. The specificity of the probe, whether it is made from a
highly specific region, e.g., the 5' regulatory region, or from a less specific region, e.g., a conserved
motif, and the stringency of the hybridization or amplification will determine whether the probe
identifies only naturally occurring sequences encoding NAAP, allelic variants, or related sequences.

Probes may also be used for the detection of related sequences, and may have at least 50% 
sequence identity to any of the NAAP encoding sequences. The hybridization probes of the subject 
invention may be DNA or RNA and may be derived from the sequence of SEQ ID NO: 36-70 or from 
genomic sequences including promoters, enhancers, and introns of the NAAP gene. 
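
The 50% sequence identity criterion can be checked with a short calculation. The Python sketch below assumes the probe and target have already been aligned to equal length without gaps; that assumption, the function names, and the treatment of identity as a simple position-by-position match are simplifications made here for illustration.

```python
def percent_identity(probe, target):
    """Percent of positions at which two equal-length sequences match."""
    if not probe or len(probe) != len(target):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(p == t for p, t in zip(probe.upper(), target.upper()))
    return 100.0 * matches / len(probe)


def meets_identity_threshold(probe, target, threshold=50.0):
    """True when the probe has at least `threshold` percent identity."""
    return percent_identity(probe, target) >= threshold
```

For example, `percent_identity("ACGTACGT", "ACGTTTTT")` counts five matching positions out of eight, or 62.5%, which satisfies the 50% criterion.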

Means for producing specific hybridization probes for polynucleotides encoding NAAP
include the cloning of polynucleotides encoding NAAP or NAAP derivatives into vectors for the
production of mRNA probes. Such vectors are known in the art, are commercially available, and may
be used to synthesize RNA probes in vitro by means of the addition of the appropriate RNA
polymerases and the appropriate labeled nucleotides. Hybridization probes may be labeled by a
variety of reporter groups, for example, by radionuclides such as ³²P or ³⁵S, or by enzymatic labels,
such as alkaline phosphatase coupled to the probe via avidin/biotin coupling systems, and the like.

Polynucleotides encoding NAAP may be used for the diagnosis of disorders associated with
expression of NAAP. Examples of such disorders include, but are not limited to, a cell proliferative
disorder such as actinic keratosis, arteriosclerosis, atherosclerosis, bursitis, cirrhosis, hepatitis, mixed
connective tissue disease (MCTD), myelofibrosis, paroxysmal nocturnal hemoglobinuria,
polycythemia vera, psoriasis, primary thrombocythemia, and cancers including adenocarcinoma,
leukemia, lymphoma, melanoma, myeloma, sarcoma, teratocarcinoma, and, in particular, a cancer of
the adrenal gland, bladder, bone, bone marrow, brain, breast, cervix, gall bladder, ganglia,
gastrointestinal tract, heart, kidney, liver, lung, muscle, ovary, pancreas, parathyroid, penis, prostate,
salivary glands, skin, spleen, testis, thymus, thyroid, and uterus; a neurological disorder such as
epilepsy, ischemic cerebrovascular disease, stroke, cerebral neoplasms, Alzheimer's disease, Pick's
disease, Huntington's disease, dementia, Parkinson's disease and other extrapyramidal disorders,
amyotrophic lateral sclerosis and other motor neuron disorders, progressive neural muscular atrophy,
retinitis pigmentosa, hereditary ataxias, multiple sclerosis and other demyelinating diseases, bacterial
and viral meningitis, brain abscess, subdural empyema, epidural abscess, suppurative intracranial
thrombophlebitis, myelitis and radiculitis, viral central nervous system disease, prion diseases
including kuru, Creutzfeldt-Jakob disease, and Gerstmann-Straussler-Scheinker syndrome, fatal
familial insomnia, nutritional and metabolic diseases of the nervous system, neurofibromatosis,
tuberous sclerosis, cerebelloretinal hemangioblastomatosis, encephalotrigeminal syndrome, mental
retardation and other developmental disorders of the central nervous system, cerebral palsy, a
neuroskeletal disorder, an autonomic nervous system disorder, a cranial nerve disorder, a spinal cord
disease, muscular dystrophy and other neuromuscular disorders, a peripheral nervous system disorder,
dermatomyositis and polymyositis, inherited, metabolic, endocrine, and toxic myopathy, myasthenia
gravis, periodic paralysis, a mental disorder including mood, anxiety, and schizophrenic disorder,
seasonal affective disorder (SAD), akathisia, amnesia, catatonia, diabetic neuropathy, tardive
dyskinesia, dystonias, paranoid psychoses, postherpetic neuralgia, and Tourette's disorder; a
developmental disorder such as renal tubular acidosis, anemia, Cushing's syndrome, achondroplastic
dwarfism, Duchenne and Becker muscular dystrophy, epilepsy, gonadal dysgenesis, WAGR
syndrome (Wilms' tumor, aniridia, genitourinary abnormalities, and mental retardation),
Smith-Magenis syndrome, myelodysplastic syndrome, hereditary mucoepithelial dysplasia, hereditary
keratodermas, hereditary neuropathies such as Charcot-Marie-Tooth disease and neurofibromatosis,
hypothyroidism, hydrocephalus, seizure disorders such as Sydenham's chorea and cerebral palsy,
spina bifida, anencephaly, craniorachischisis, congenital glaucoma, cataract, and sensorineural
hearing loss; an autoimmune/inflammatory disorder such as acquired immunodeficiency syndrome
(AIDS), Addison's disease, adult respiratory distress syndrome, allergies, ankylosing spondylitis,
amyloidosis, anemia, asthma, atherosclerosis, autoimmune hemolytic anemia, autoimmune thyroiditis,
autoimmune polyendocrinopathy-candidiasis-ectodermal dystrophy (APECED), bronchitis,
cholecystitis, contact dermatitis, Crohn's disease, atopic dermatitis, dermatomyositis, diabetes
mellitus, emphysema, episodic lymphopenia with lymphocytotoxins, erythroblastosis fetalis,
erythema nodosum, atrophic gastritis, glomerulonephritis, Goodpasture's syndrome, gout, Graves'
disease, Hashimoto's thyroiditis, hypereosinophilia, irritable bowel syndrome, multiple sclerosis,
myasthenia gravis, myocardial or pericardial inflammation, osteoarthritis, osteoporosis, pancreatitis,
polymyositis, psoriasis, Reiter's syndrome, rheumatoid arthritis, scleroderma, Sjogren's syndrome,
systemic anaphylaxis, systemic lupus erythematosus, systemic sclerosis, thrombocytopenic purpura,
ulcerative colitis, uveitis, Werner syndrome, complications of cancer, hemodialysis, and
extracorporeal circulation, viral, bacterial, fungal, parasitic, protozoal, and helminthic infections, and
trauma; an infection, such as those caused by a viral agent classified as adenovirus, arenavirus,
bunyavirus, calicivirus, coronavirus, filovirus, hepadnavirus, herpesvirus, flavivirus, orthomyxovirus,
parvovirus, papovavirus, paramyxovirus, picornavirus, poxvirus, reovirus, retrovirus, rhabdovirus, or
togavirus; an infection caused by a bacterial agent classified as pneumococcus, staphylococcus,
streptococcus, bacillus, corynebacterium, clostridium, meningococcus, gonococcus, listeria,
moraxella, kingella, haemophilus, legionella, bordetella, gram-negative enterobacterium including
shigella, salmonella, or campylobacter, pseudomonas, vibrio, brucella, francisella, yersinia,
bartonella, nocardia, actinomyces, mycobacterium, spirochaetale, rickettsia, chlamydia, or
mycoplasma; an infection caused by a fungal agent classified as aspergillus, blastomyces,
dermatophytes, cryptococcus, coccidioides, malassezia, histoplasma, or other mycosis-causing fungal
agent; and an infection caused by a parasite classified as plasmodium or malaria-causing, parasitic
entamoeba, leishmania, trypanosoma, toxoplasma, Pneumocystis carinii, intestinal protozoa such as
giardia, trichomonas, tissue nematode such as trichinella, intestinal nematode such as ascaris,
lymphatic filarial nematode, trematode such as schistosoma, and cestode such as tapeworm.

Polynucleotides encoding NAAP may be used in Southern or northern analysis, dot blot, or other 
membrane-based technologies; in PCR technologies; in dipstick, pin, and multiformat ELISA-like
assays; and in microarrays utilizing fluids or tissues from patients to detect altered NAAP expression. 
Such qualitative or quantitative methods are well known in the art. 

In a particular aspect, polynucleotides encoding NAAP may be used in assays that detect the
presence of associated disorders, particularly those mentioned above. Polynucleotides
complementary to sequences encoding NAAP may be labeled by standard methods and added to a
fluid or tissue sample from a patient under conditions suitable for the formation of hybridization
complexes. After a suitable incubation period, the sample is washed and the signal is quantified and
compared with a standard value. If the amount of signal in the patient sample is significantly altered
in comparison to a control sample then the presence of altered levels of polynucleotides encoding
NAAP in the sample indicates the presence of the associated disorder. Such assays may also be used
to evaluate the efficacy of a particular therapeutic treatment regimen in animal studies, in clinical
trials, or to monitor the treatment of an individual patient. 

In order to provide a basis for the diagnosis of a disorder associated with expression of
NAAP, a normal or standard profile for expression is established. This may be accomplished by
combining body fluids or cell extracts taken from normal subjects, either animal or human, with a
sequence, or a fragment thereof, encoding NAAP, under conditions suitable for hybridization or
amplification. Standard hybridization may be quantified by comparing the values obtained from
normal subjects with values from an experiment in which a known amount of a substantially purified
polynucleotide is used. Standard values obtained in this manner may be compared with values 
obtained from samples from patients who are symptomatic for a disorder. Deviation from standard 
values is used to establish the presence of a disorder. 

Once the presence of a disorder is established and a treatment protocol is initiated,
hybridization assays may be repeated on a regular basis to determine if the level of expression in the
patient begins to approximate that which is observed in the normal subject. The results obtained from 
successive assays may be used to show the efficacy of treatment over a period ranging from several 
days to months. 

With respect to cancer, the presence of an abnormal amount of transcript (either under- or
overexpressed) in biopsied tissue from an individual may indicate a predisposition for the
development of the disease, or may provide a means for detecting the disease prior to the appearance
of actual clinical symptoms. A more definitive diagnosis of this type may allow health professionals 
to employ preventative measures or aggressive treatment earlier, thereby preventing the development 
or further progression of the cancer. 

Additional diagnostic uses for oligonucleotides designed from the sequences encoding NAAP
may involve the use of PCR. These oligomers may be chemically synthesized, generated
enzymatically, or produced in vitro. Oligomers will preferably contain a fragment of a polynucleotide
encoding NAAP, or a fragment of a polynucleotide complementary to the polynucleotide encoding
NAAP, and will be employed under optimized conditions for identification of a specific gene or
condition. Oligomers may also be employed under less stringent conditions for detection or
quantification of closely related DNA or RNA sequences.

In a particular aspect, oligonucleotide primers derived from polynucleotides encoding NAAP
may be used to detect single nucleotide polymorphisms (SNPs). SNPs are substitutions, insertions
and deletions that are a frequent cause of inherited or acquired genetic disease in humans. Methods
of SNP detection include, but are not limited to, single-stranded conformation polymorphism (SSCP)
and fluorescent SSCP (fSSCP) methods. In SSCP, oligonucleotide primers derived from
polynucleotides encoding NAAP are used to amplify DNA using the polymerase chain reaction
(PCR). The DNA may be derived, for example, from diseased or normal tissue, biopsy samples,
bodily fluids, and the like. SNPs in the DNA cause differences in the secondary and tertiary
structures of PCR products in single-stranded form, and these differences are detectable using gel
electrophoresis in non-denaturing gels. In fSSCP, the oligonucleotide primers are fluorescently
labeled, which allows detection of the amplimers in high-throughput equipment such as DNA
sequencing machines. Additionally, sequence database analysis methods, termed in silico SNP
(isSNP), are capable of identifying polymorphisms by comparing the sequence of individual
overlapping DNA fragments which assemble into a common consensus sequence. These
computer-based methods filter out sequence variations due to laboratory preparation of DNA and
sequencing errors using statistical models and automated analyses of DNA sequence chromatograms.
In the alternative, SNPs may be detected and characterized by mass spectrometry using, for example,
the high throughput MASSARRAY system (Sequenom, Inc., San Diego CA).
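
The in silico approach of comparing overlapping fragments against their consensus can be illustrated with a deliberately simplified Python sketch. It assumes the fragments are already aligned to equal length, and it replaces the statistical models and chromatogram analyses mentioned above with a crude count-and-fraction filter. The function name `candidate_snps` and both thresholds are hypothetical.

```python
from collections import Counter


def candidate_snps(aligned_reads, min_minor_fraction=0.2, min_minor_count=2):
    """Build a consensus and list positions with a recurring second base.

    aligned_reads -- equal-length strings; characters other than A, C, G, T
                     (for example '-' or 'N') are ignored at that position
    Returns (consensus, [(position, major_base, minor_base), ...]).
    A variant seen in fewer than `min_minor_count` reads, or in less than
    `min_minor_fraction` of reads, is treated as a likely sequencing error.
    """
    if not aligned_reads:
        return "", []
    length = len(aligned_reads[0])
    if any(len(read) != length for read in aligned_reads):
        raise ValueError("reads must be aligned to equal length")
    consensus, snps = [], []
    for pos in range(length):
        counts = Counter(r[pos] for r in aligned_reads if r[pos] in "ACGT")
        if not counts:
            consensus.append("N")
            continue
        ranked = counts.most_common()
        major_base = ranked[0][0]
        consensus.append(major_base)
        if len(ranked) > 1:
            minor_base, minor_count = ranked[1]
            total = sum(counts.values())
            if (minor_count >= min_minor_count
                    and minor_count / total >= min_minor_fraction):
                snps.append((pos, major_base, minor_base))
    return "".join(consensus), snps
```

In this sketch a base seen only once at a position is discarded as probable error, whereas a base seen in two of six reads is reported as a candidate polymorphism.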

SNPs may be used to study the genetic basis of human disease. For example, at least 16
common SNPs have been associated with non-insulin-dependent diabetes mellitus. SNPs are also
useful for examining differences in disease outcomes in monogenic disorders, such as cystic fibrosis,
sickle cell anemia, or chronic granulomatous disease. For example, variants in the mannose-binding
lectin, MBL2, have been shown to be correlated with deleterious pulmonary outcomes in cystic
fibrosis. SNPs also have utility in pharmacogenomics, the identification of genetic variants that
influence a patient's response to a drug, such as life-threatening toxicity. For example, a variation in
N-acetyl transferase is associated with a high incidence of peripheral neuropathy in response to the
anti-tuberculosis drug isoniazid, while a variation in the core promoter of the ALOX5 gene results in
diminished clinical response to treatment with an anti-asthma drug that targets the 5-lipoxygenase
pathway. Analysis of the distribution of SNPs in different populations is useful for investigating
genetic drift, mutation, recombination, and selection, as well as for tracing the origins of populations
and their migrations (Taylor, J.G. et al. (2001) Trends Mol. Med. 7:507-512; Kwok, P.-Y. and Z. Gu
(1999) Mol. Med. Today 5:538-543; Nowotny, P. et al. (2001) Curr. Opin. Neurobiol. 11:637-641).

Methods which may also be used to quantify the expression of NAAP include radiolabeling
or biotinylating nucleotides, coamplification of a control nucleic acid, and interpolating results from
standard curves (Melby, P.C. et al. (1993) J. Immunol. Methods 159:235-244; Duplaa, C. et al. (1993)
Anal. Biochem. 212:229-236). The speed of quantitation of multiple samples may be accelerated by
running the assay in a high-throughput format where the oligomer or polynucleotide of interest is
presented in various dilutions and a spectrophotometric or colorimetric response gives rapid
quantitation.
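
Interpolation from a standard curve can be sketched as follows. This minimal Python example assumes a piecewise-linear curve through standards of known amount, which is one common choice and not necessarily the procedure used in the cited references; the function name and the example values are hypothetical.

```python
def interpolate_from_standard_curve(signal, standards):
    """Estimate the amount that corresponds to a measured signal.

    standards -- iterable of (known_amount, measured_signal) pairs
    Uses piecewise-linear interpolation between neighboring standards and
    refuses to extrapolate beyond the measured range.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (amt_lo, sig_lo), (amt_hi, sig_hi) in zip(points, points[1:]):
        if sig_lo <= signal <= sig_hi and sig_hi != sig_lo:
            fraction = (signal - sig_lo) / (sig_hi - sig_lo)
            return amt_lo + fraction * (amt_hi - amt_lo)
    raise ValueError("signal lies outside the range of the standard curve")
```

For example, with standards of 0, 1, 2, and 4 units giving signals of 0, 100, 200, and 400, a sample signal of 300 interpolates to 3 units.
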

In further embodiments, oligonucleotides or longer fragments derived from any of the 
polynucleotides described herein may be used as elements on a microarray. The microarray can be 
used in transcript imaging techniques which monitor the relative expression levels of large numbers 
of genes simultaneously as described below. The microarray may also be used to identify genetic
variants, mutations, and polymorphisms. This information may be used to determine gene function,
to understand the genetic basis of a disorder, to diagnose a disorder, to monitor
progression/regression of disease as a function of gene expression, and to develop and monitor the
activities of therapeutic agents in the treatment of disease. In particular, this information may be used
to develop a pharmacogenomic profile of a patient in order to select the most appropriate and
effective treatment regimen for that patient. For example, therapeutic agents which are highly
effective and display the fewest side effects may be selected for a patient based on his/her 
pharmacogenomic profile. 

In another embodiment, NAAP, fragments of NAAP, or antibodies specific for NAAP may be 
used as elements on a microarray. The microarray may be used to monitor or measure protein-protein
interactions, drug-target interactions, and gene expression profiles, as described above.

A particular embodiment relates to the use of the polynucleotides of the present invention to 
generate a transcript image of a tissue or cell type. A transcript image represents the global pattern of 
gene expression by a particular tissue or cell type. Global gene expression patterns are analyzed by 
quantifying the number of expressed genes and their relative abundance under given conditions and at
a given time (Seilhamer et al., "Comparative Gene Transcript Analysis," U.S. Patent No. 5,840,484;
hereby expressly incorporated by reference herein). Thus a transcript image may be generated by
hybridizing the polynucleotides of the present invention or their complements to the totality of
transcripts or reverse transcripts of a particular tissue or cell type. In one embodiment, the
hybridization takes place in high-throughput format, wherein the polynucleotides of the present
invention or their complements comprise a subset of a plurality of elements on a microarray. The
resultant transcript image would provide a profile of gene activity. 
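
Quantifying the number of expressed genes and their relative abundance can be illustrated with a short Python sketch. The input format (a mapping of gene name to a hybridization signal or transcript count), the function name `transcript_image`, and the rule that any positive value counts as expressed are assumptions made here for illustration.

```python
def transcript_image(counts):
    """Summarize expression as (number_of_expressed_genes, relative_abundance).

    counts -- mapping of gene name to a non-negative signal or count
    Genes with a value of zero are treated as not expressed and are omitted
    from the relative abundance profile.
    """
    expressed = {gene: value for gene, value in counts.items() if value > 0}
    total = sum(expressed.values())
    if total == 0:
        return 0, {}
    return len(expressed), {g: v / total for g, v in expressed.items()}
```

For example, signals of 30, 10, and 0 for three genes give two expressed genes with relative abundances of 0.75 and 0.25.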

Transcript images may be generated using transcripts isolated from tissues, cell lines, 
biopsies, or other biological samples. The transcript image may thus reflect gene expression in vivo, 
as in the case of a tissue or biopsy sample, or in vitro, as in the case of a cell line. 

Transcript images which profile the expression of the polynucleotides of the present
invention may also be used in conjunction with in vitro model systems and preclinical evaluation of
pharmaceuticals, as well as toxicological testing of industrial and naturally-occurring environmental
compounds. All compounds induce characteristic gene expression patterns, frequently termed
molecular fingerprints or toxicant signatures, which are indicative of mechanisms of action and
toxicity (Nuwaysir, E.F. et al. (1999) Mol. Carcinog. 24:153-159; Steiner, S. and N.L. Anderson
(2000) Toxicol. Lett. 112-113:467-471). If a test compound has a signature similar to that of a
compound with known toxicity, it is likely to share those toxic properties. These fingerprints or 
signatures are most useful and refined when they contain expression information from a large number 
of genes and gene families. Ideally, a genome-wide measurement of expression provides the highest
quality signature. Genes whose expression is not altered by any tested compound are also important,
because the levels of expression of these genes are used to normalize the rest of the expression
data. The normalization procedure is useful for comparison of expression data after treatment with
different compounds. While the assignment of gene function to elements of a toxicant signature aids 
in interpretation of toxicity mechanisms, knowledge of gene function is not necessary for the
statistical matching of signatures which leads to prediction of toxicity (see, for example, Press
Release 00-02 from the National Institute of Environmental Health Sciences, released February 29,
2000, available at http://www.niehs.nih.gov/oc/news/toxchip.htm). Therefore, it is important and 
desirable in toxicological screening using toxicant signatures to include all expressed gene sequences. 
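The statistical matching of toxicant signatures described above can be illustrated with a minimal sketch. The text does not name a matching statistic, so the Pearson correlation used here, the function names, and the use of unaltered genes as normalization references are assumptions made for illustration only.

```python
from math import sqrt


def normalize(profile, reference_genes):
    """Scale every expression level by the mean level of the reference genes.

    reference_genes are assumed to be genes whose expression is not
    altered by any tested compound.
    """
    ref_mean = sum(profile[g] for g in reference_genes) / len(reference_genes)
    return {gene: level / ref_mean for gene, level in profile.items()}


def pearson(xs, ys):
    """Pearson correlation of two equal-length lists (assumes non-zero variance)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def signature_similarity(test_profile, known_profile, reference_genes):
    """Correlate two normalized signatures over the genes they share."""
    a = normalize(test_profile, reference_genes)
    b = normalize(known_profile, reference_genes)
    shared = sorted(set(a) & set(b))
    return pearson([a[g] for g in shared], [b[g] for g in shared])
```

A similarity close to 1 would suggest that the test compound resembles the compound of known toxicity; any cutoff for calling a match would have to be chosen empirically.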
In an embodiment, the toxicity of a test compound can be assessed by treating a biological
sample containing nucleic acids with the test compound. Nucleic acids that are expressed in the
treated biological sample are hybridized with one or more probes specific to the polynucleotides of 
the present invention, so that transcript levels corresponding to the polynucleotides of the present 
invention may be quantified. The transcript levels in the treated biological sample are compared with 
levels in an untreated biological sample. Differences in the transcript levels between the two samples
are indicative of a toxic response caused by the test compound in the treated sample.
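The comparison of treated and untreated transcript levels can be sketched as follows. The text speaks only of "differences" between the two samples; the log2 fold-change measure, the default threshold, and the function name are assumptions chosen for illustration.

```python
from math import log2


def changed_transcripts(treated, untreated, min_abs_log2_fold=1.0):
    """Return {transcript: log2 fold change} for transcripts whose level differs.

    treated and untreated map transcript names to quantified levels.
    The threshold (a two-fold change by default) is a hypothetical cutoff.
    """
    changes = {}
    for name in treated.keys() & untreated.keys():
        if treated[name] <= 0 or untreated[name] <= 0:
            continue  # a ratio is undefined without a positive signal in both samples
        fold = log2(treated[name] / untreated[name])
        if abs(fold) >= min_abs_log2_fold:
            changes[name] = fold
    return changes
```

For example, a transcript measured at 40 units in the treated sample and 10 units in the untreated sample has a log2 fold change of 2 and would be reported.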

Another embodiment relates to the use of the polypeptides disclosed herein to analyze the 
proteome of a tissue or cell type. The term proteome refers to the global pattern of protein expression 
in a particular tissue or cell type. Each protein component of a proteome can be subjected
individually to further analysis. Proteome expression patterns, or profiles, are analyzed by
quantifying the number of expressed proteins and their relative abundance under given conditions and
at a given time. A profile of a cell's proteome may thus be generated by separating and analyzing the 
polypeptides of a particular tissue or cell type. In one embodiment, the separation is achieved using 
two-dimensional gel electrophoresis, in which proteins from a sample are separated by isoelectric 
focusing in the first dimension, and then according to molecular weight by sodium dodecyl sulfate
slab gel electrophoresis in the second dimension (Steiner and Anderson, supra). The proteins are
visualized in the gel as discrete and uniquely positioned spots, typically by staining the gel with an
agent such as Coomassie Blue or silver or fluorescent stains. The optical density of each protein spot 
is generally proportional to the level of the protein in the sample. The optical densities of 
equivalently positioned protein spots from different samples, for example, from biological samples
either treated or untreated with a test compound or therapeutic agent, are compared to identify any
changes in protein spot density related to the treatment. The proteins in the spots are partially
sequenced using, for example, standard methods employing chemical or enzymatic cleavage followed
by mass spectrometry. The identity of the protein in a spot may be determined by comparing its
partial sequence, preferably of at least 5 contiguous amino acid residues, to the polypeptide sequences
of interest. In some cases, further sequence data may be obtained for definitive protein identification.
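The comparison of a partial sequence to the polypeptide sequences of interest can be sketched as a simple containment check. The function name and the example identifiers are hypothetical, and exact substring matching is an assumption; a practical comparison might tolerate sequencing errors.

```python
def identify_spot(partial_sequence, polypeptides, min_residues=5):
    """Return names of polypeptides that contain the partial sequence.

    Partial sequences shorter than min_residues are rejected as too short
    to be informative (5 contiguous residues is the preferred minimum in the text).
    """
    query = partial_sequence.strip().upper()
    if len(query) < min_residues:
        return []
    return sorted(name for name, seq in polypeptides.items() if query in seq.upper())
```

If more than one name is returned, the match is ambiguous, which corresponds to the case in which further sequence data would be needed for definitive identification.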

A proteomic profile may also be generated using antibodies specific for NAAP to quantify 
the levels of NAAP expression. In one embodiment, the antibodies are used as elements on a 
microarray, and protein expression levels are quantified by exposing the microarray to the sample and 
detecting the levels of protein bound to each array element (Lueking, A. et al. (1999) Anal. Biochem.
270:103-111; Mendoza, L.G. et al. (1999) Biotechniques 27:778-788). Detection may be performed
by a variety of methods known in the art, for example, by reacting the proteins in the sample with a
thiol- or amino-reactive fluorescent compound and detecting the amount of fluorescence bound at
each array element. 

Toxicant signatures at the proteome level are also useful for toxicological screening, and
should be analyzed in parallel with toxicant signatures at the transcript level. There is a poor
correlation between transcript and protein abundances for some proteins in some tissues (Anderson,
N.L. and J. Seilhamer (1997) Electrophoresis 18:533-537), so proteome toxicant signatures may be
useful in the analysis of compounds which do not significantly affect the transcript image, but which
alter the proteomic profile. In addition, the analysis of transcripts in body fluids is difficult, due to
rapid degradation of mRNA, so proteomic profiling may be more reliable and informative in such
cases. 

In another embodiment, the toxicity of a test compound is assessed by treating a biological 
sample containing proteins with the test compound. Proteins that are expressed in the treated 
biological sample are separated so that the amount of each protein can be quantified. The amount of
each protein is compared to the amount of the corresponding protein in an untreated biological
sample. A difference in the amount of protein between the two samples is indicative of a toxic
response to the test compound in the treated sample. Individual proteins are identified by sequencing 
the amino acid residues of the individual proteins and comparing these partial sequences to the 
polypeptides of the present invention. 

In another embodiment, the toxicity of a test compound is assessed by treating a biological
sample containing proteins with the test compound. Proteins from the biological sample are
incubated with antibodies specific to the polypeptides of the present invention. The amount of
protein recognized by the antibodies is quantified. The amount of protein in the treated biological
sample is compared with the amount in an untreated biological sample. A difference in the amount of
protein between the two samples is indicative of a toxic response to the test compound in the treated
sample. 

Microarrays may be prepared, used, and analyzed using methods known in the art (Brennan, 
T.M. et al. (1995) U.S. Patent No. 5,474,796; Schena, M. et al. (1996) Proc. Natl. Acad. Sci. USA 
93:10614-10619; Baldeschweiler et al. (1995) PCT application WO95/251116; Shalon, D. et al.
(1995) PCT application WO95/35505; Heller, R.A. et al. (1997) Proc. Natl. Acad. Sci. USA 94:2150-
2155; Heller, M.J. et al. (1997) U.S. Patent No. 5,605,662). Various types of microarrays are well
known and thoroughly described in Schena, M., ed. (1999; DNA Microarrays: A Practical Approach,
Oxford University Press, London). 

In another embodiment of the invention, nucleic acid sequences encoding NAAP may be used
to generate hybridization probes useful in mapping the naturally occurring genomic sequence. Either
coding or noncoding sequences may be used, and in some instances, noncoding sequences may be 
preferable over coding sequences. For example, conservation of a coding sequence among members 
of a multi-gene family may potentially cause undesired cross hybridization during chromosomal 
mapping. The sequences may be mapped to a particular chromosome, to a specific region of a
chromosome, or to artificial chromosome constructions, e.g., human artificial chromosomes (HACs),
yeast artificial chromosomes (YACs), bacterial artificial chromosomes (BACs), bacterial P1
constructions, or single chromosome cDNA libraries (Harrington, J.J. et al. (1997) Nat. Genet. 
15:345-355; Price, C.M. (1993) Blood Rev. 7:127-134; Trask, B.J. (1991) Trends Genet. 7:149-154). 
Once mapped, the nucleic acid sequences may be used to develop genetic linkage maps, for example,
which correlate the inheritance of a disease state with the inheritance of a particular chromosome
region or restriction fragment length polymorphism (RFLP) (Lander, E.S. and D. Botstein (1986) 
Proc. Natl. Acad. Sci. USA 83:7353-7357). 

Fluorescent in situ hybridization (FISH) may be correlated with other physical and genetic
map data (Heinz-Ulrich, et al. (1995) in Meyers, supra, pp. 965-968). Examples of genetic map data
can be found in various scientific journals or at the Online Mendelian Inheritance in Man (OMIM)
World Wide Web site. Correlation between the location of the gene encoding NAAP on a physical 
map and a specific disorder, or a predisposition to a specific disorder, may help define the region of 
DNA associated with that disorder and thus may further positional cloning efforts. 

In situ hybridization of chromosomal preparations and physical mapping techniques, such as
linkage analysis using established chromosomal markers, may be used for extending genetic maps.
Often the placement of a gene on the chromosome of another mammalian species, such as mouse,
may reveal associated markers even if the exact chromosomal locus is not known. This information is
valuable to investigators searching for disease genes using positional cloning or other gene discovery
techniques. Once the gene or genes responsible for a disease or syndrome have been crudely
localized by genetic linkage to a particular genomic region, e.g., ataxia-telangiectasia to 11q22-23,
any sequences mapping to that area may represent associated or regulatory genes for further
investigation (Gatti, R.A. et al. (1988) Nature 336:577-580). The nucleotide sequence of the instant
invention may also be used to detect differences in the chromosomal location due to translocation,
inversion, etc., among normal, carrier, or affected individuals. 

In another embodiment of the invention, NAAP, its catalytic or immunogenic fragments, or
oligopeptides thereof can be used for screening libraries of compounds in any of a variety of drug
screening techniques. The fragment employed in such screening may be free in solution, affixed to a 
solid support, borne on a cell surface, or located intracellularly. The formation of binding complexes 
between NAAP and the agent being tested may be measured. 

Another technique for drug screening provides for high throughput screening of compounds
having suitable binding affinity to the protein of interest (Geysen, et al. (1984) PCT application
WO84/03564). In this method, large numbers of different small test compounds are synthesized on a 
solid substrate. The test compounds are reacted with NAAP, or fragments thereof, and washed. 
Bound NAAP is then detected by methods well known in the art. Purified NAAP can also be coated
directly onto plates for use in the aforementioned drug screening techniques. Alternatively,
non-neutralizing antibodies can be used to capture the peptide and immobilize it on a solid support.

In another embodiment, one may use competitive drug screening assays in which neutralizing
antibodies capable of binding NAAP specifically compete with a test compound for binding NAAP.
In this manner, antibodies can be used to detect the presence of any peptide which shares one or more
antigenic determinants with NAAP.

In additional embodiments, the nucleotide sequences which encode NAAP may be used in 
any molecular biology techniques that have yet to be developed, provided the new techniques rely on 
properties of nucleotide sequences that are currently known, including, but not limited to, such 
properties as the triplet genetic code and specific base pair interactions. 

Without further elaboration, it is believed that one skilled in the art can, using the preceding
description, utilize the present invention to its fullest extent. The following embodiments are,
therefore, to be construed as merely illustrative, and not limitative of the remainder of the disclosure 
in any way whatsoever. 

The disclosures of all patents, applications and publications, mentioned above and below,
including U.S. Ser. No. 60/305,089, U.S. Ser. No. 60/305,104, U.S. Ser. No. 60/305,325, U.S. Ser.
No. 60/305,390, U.S. Ser. No. 60/306,694, U.S. Ser. No. 60/306,960, and U.S. Ser. No. 60/308,170,
are expressly incorporated by reference herein. 

EXAMPLES 

I. Construction of cDNA Libraries

Incyte cDNAs were derived from cDNA libraries described in the LIFESEQ GOLD database
(Incyte Genomics, Palo Alto CA). Some tissues were homogenized and lysed in guanidinium
isothiocyanate, while others were homogenized and lysed in phenol or in a suitable mixture of
denaturants, such as TRIZOL (Invitrogen), a monophasic solution of phenol and guanidine
isothiocyanate. The resulting lysates were centrifuged over CsCl cushions or extracted with
chloroform. RNA was precipitated from the lysates with either isopropanol or sodium acetate and
ethanol, or by other routine methods. 

Phenol extraction and precipitation of RNA were repeated as necessary to increase RNA 
purity. In some cases, RNA was treated with DNase. For most libraries, poly(A)+ RNA was isolated
using oligo d(T)-coupled paramagnetic particles (Promega), OLIGOTEX latex particles (QIAGEN,
Chatsworth CA), or an OLIGOTEX mRNA purification kit (QIAGEN). Alternatively, RNA was 
isolated directly from tissue lysates using other RNA isolation kits, e.g., the POLY(A)PURE mRNA 
purification kit (Ambion, Austin TX). 

In some cases, Stratagene was provided with RNA and constructed the corresponding cDNA
libraries. Otherwise, cDNA was synthesized and cDNA libraries were constructed with the UNIZAP
vector system (Stratagene) or SUPERSCRIPT plasmid system (Invitrogen), using the recommended
procedures or similar methods known in the art (Ausubel et al., supra, ch. 5). Reverse transcription
was initiated using oligo d(T) or random primers. Synthetic oligonucleotide adapters were ligated to
double stranded cDNA, and the cDNA was digested with the appropriate restriction enzyme or
enzymes. For most libraries, the cDNA was size-selected (300-1000 bp) using SEPHACRYL S1000,
SEPHAROSE CL2B, or SEPHAROSE CL4B column chromatography (Amersham Biosciences) or
preparative agarose gel electrophoresis. cDNAs were ligated into compatible restriction enzyme sites
of the polylinker of a suitable plasmid, e.g., PBLUESCRIPT plasmid (Stratagene), PSPORT1 plasmid
(Invitrogen), PCDNA2.1 plasmid (Invitrogen, Carlsbad CA), PBK-CMV plasmid (Stratagene),
PCR2-TOPOTA plasmid (Invitrogen), PCMV-ICIS plasmid (Stratagene), pIGEN (Incyte Genomics, Palo
Alto CA), pRARE (Incyte Genomics), or pINCY (Incyte Genomics), or derivatives thereof.
Recombinant plasmids were transformed into competent E. coli cells including XL1-Blue,
XL1-BlueMRF, or SOLR from Stratagene or DH5α, DH10B, or ElectroMAX DH10B from Invitrogen.
II. Isolation of cDNA Clones 

Plasmids obtained as described in Example I were recovered from host cells by in vivo
excision using the UNIZAP vector system (Stratagene) or by cell lysis. Plasmids were purified using
at least one of the following: a Magic or WIZARD Minipreps DNA purification system (Promega); an
AGTC Miniprep purification kit (Edge Biosystems, Gaithersburg MD); and QIAWELL 8 Plasmid,
QIAWELL 8 Plus Plasmid, QIAWELL 8 Ultra Plasmid purification systems or the R.E.A.L. PREP 96
plasmid purification kit from QIAGEN. Following precipitation, plasmids were resuspended in 0.1
ml of distilled water and stored, with or without lyophilization, at 4°C.

Alternatively, plasmid DNA was amplified from host cell lysates using direct link PCR in a
high-throughput format (Rao, V.B. (1994) Anal. Biochem. 216:1-14). Host cell lysis and thermal
cycling steps were carried out in a single reaction mixture. Samples were processed and stored in
384-well plates, and the concentration of amplified plasmid DNA was quantified fluorometrically
using PICOGREEN dye (Molecular Probes, Eugene OR) and a FLUOROSKAN II fluorescence
scanner (Labsystems Oy, Helsinki, Finland).
III. Sequencing and Analysis 

Incyte cDNAs recovered in plasmids as described in Example II were sequenced as follows.
Sequencing reactions were processed using standard methods or high-throughput instrumentation
such as the ABI CATALYST 800 (Applied Biosystems) thermal cycler or the PTC-200 thermal 
cycler (MJ Research) in conjunction with the HYDRA microdispenser (Robbins Scientific) or the 
MICROLAB 2200 (Hamilton) liquid transfer system. cDNA sequencing reactions were prepared 
using reagents provided by Amersham Biosciences or supplied in ABI sequencing kits such as the
ABI PRISM BIGDYE Terminator cycle sequencing ready reaction kit (Applied Biosystems).
Electrophoretic separation of cDNA sequencing reactions and detection of labeled polynucleotides
were carried out using the MEGABACE 1000 DNA sequencing system (Amersham Biosciences); the 
ABI PRISM 373 or 377 sequencing system (Applied Biosystems) in conjunction with standard ABI
protocols and base calling software; or other sequence analysis systems known in the art. Reading
frames within the cDNA sequences were identified using standard methods (Ausubel et al., supra, ch.
7). Some of the cDNA sequences were selected for extension using the techniques disclosed in
Example VIII.

The polynucleotide sequences derived from Incyte cDNAs were validated by removing
vector, linker, and poly(A) sequences and by masking ambiguous bases, using algorithms and
programs based on BLAST, dynamic programming, and dinucleotide nearest neighbor analysis. The
Incyte cDNA sequences or translations thereof were then queried against a selection of public 
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases, and 
BLOCKS, PRINTS, DOMO, PRODOM; PROTEOME databases with sequences from Homo sapiens,
Rattus norvegicus, Mus musculus, Caenorhabditis elegans, Saccharomyces cerevisiae,
Schizosaccharomyces pombe, and Candida albicans (Incyte Genomics, Palo Alto CA); hidden
Markov model (HMM)-based protein family databases such as PFAM, INCY, and TIGRFAM (Haft,
D.H. et al. (2001) Nucleic Acids Res. 29:41-43); and HMM-based protein domain databases such as
SMART (Schultz, J. et al. (1998) Proc. Natl. Acad. Sci. USA 95:5857-5864; Letunic, I. et al. (2002)
Nucleic Acids Res. 30:242-244). (HMM is a probabilistic approach which analyzes consensus
primary structures of gene families; see, for example, Eddy, S.R. (1996) Curr. Opin. Struct. Biol.
6:361-365.) The queries were performed using programs based on BLAST, FASTA, BLIMPS, and
HMMER. The Incyte cDNA sequences were assembled to produce full length polynucleotide
sequences. Alternatively, GenBank cDNAs, GenBank ESTs, stitched sequences, stretched sequences,
or Genscan-predicted coding sequences (see Examples IV and V) were used to extend Incyte cDNA
assemblages to full length. Assembly was performed using programs based on Phred, Phrap, and
Consed, and cDNA assemblages were screened for open reading frames using programs based on
GeneMark, BLAST, and FASTA. The full length polynucleotide sequences were translated to derive
the corresponding full length polypeptide sequences. Alternatively, a polypeptide may begin at any
of the methionine residues of the full length translated polypeptide. Full length polypeptide
sequences were subsequently analyzed by querying against databases such as the GenBank protein
databases (genpept), SwissProt, the PROTEOME databases, BLOCKS, PRINTS, DOMO, PRODOM, 
Prosite, hidden Markov model (HMM)-based protein family databases such as PFAM, INCY, and 
TIGRFAM; and HMM-based protein domain databases such as SMART. Full length polynucleotide 
sequences are also analyzed using MACDNASIS PRO software (Hitachi Software Engineering,
South San Francisco CA) and LASERGENE software (DNASTAR). Polynucleotide and polypeptide
sequence alignments are generated using default parameters specified by the CLUSTAL algorithm as 
incorporated into the MEGALIGN multisequence alignment program (DNASTAR), which also 
calculates the percent identity between aligned sequences. 

Table 7 summarizes the tools, programs, and algorithms used for the analysis and assembly of
Incyte cDNA and full length sequences and provides applicable descriptions, references, and
threshold parameters. The first column of Table 7 shows the tools, programs, and algorithms used,
the second column provides brief descriptions thereof, the third column presents appropriate
references, all of which are incorporated by reference herein in their entirety, and the fourth column
presents, where applicable, the scores, probability values, and other parameters used to evaluate the
strength of a match between two sequences (the higher the score or the lower the probability value,
the greater the identity between two sequences).

The programs described above for the assembly and analysis of full length polynucleotide 
and polypeptide sequences were also used to identify polynucleotide sequence fragments from SEQ 
ID NO:36-70. Fragments from about 20 to about 4000 nucleotides which are useful in hybridization
and amplification technologies are described in Table 4, column 2.
IV. Identification and Editing of Coding Sequences from Genomic DNA

Putative nucleic acid-associated proteins were initially identified by running the Genscan 
gene identification program against public genomic sequence databases (e.g., gbpri and gbhtg). 
Genscan is a general-purpose gene identification program which analyzes genomic DNA sequences
from a variety of organisms (Burge, C. and S. Karlin (1997) J. Mol. Biol. 268:78-94; Burge, C. and S.
Karlin (1998) Curr. Opin. Struct. Biol. 8:346-354). The program concatenates predicted exons to
form an assembled cDNA sequence extending from a methionine to a stop codon. The output of
Genscan is a FASTA database of polynucleotide and polypeptide sequences. The maximum range of
sequence for Genscan to analyze at once was set to 30 kb. To determine which of these Genscan
predicted cDNA sequences encode nucleic acid-associated proteins, the encoded polypeptides were
analyzed by querying against PFAM models for nucleic acid-associated proteins. Potential nucleic 
acid-associated proteins were also identified by homology to Incyte cDNA sequences that had been
annotated as nucleic acid-associated proteins. These selected Genscan-predicted sequences were then 
compared by BLAST analysis to the genpept and gbpri public databases. Where necessary, the
Genscan-predicted sequences were then edited by comparison to the top BLAST hit from genpept to
correct errors in the sequence predicted by Genscan, such as extra or omitted exons. BLAST analysis 
was also used to find any Incyte cDNA or public cDNA coverage of the Genscan-predicted 
sequences, thus providing evidence for transcription. When Incyte cDNA coverage was available, 
this information was used to correct or confirm the Genscan predicted sequence. Full length
polynucleotide sequences were obtained by assembling Genscan-predicted coding sequences with
Incyte cDNA sequences and/or public cDNA sequences using the assembly process described in
Example III. Alternatively, full length polynucleotide sequences were derived entirely from edited or
unedited Genscan-predicted coding sequences. 

V. Assembly of Genomic Sequence Data with cDNA Sequence Data
"Stitched" Sequences

Partial cDNA sequences were extended with exons predicted by the Genscan gene 
identification program described in Example IV. Partial cDNAs assembled as described in Example
III were mapped to genomic DNA and parsed into clusters containing related cDNAs and Genscan
exon predictions from one or more genomic sequences. Each cluster was analyzed using an algorithm
based on graph theory and dynamic programming to integrate cDNA and genomic information,
generating possible splice variants that were subsequently confirmed, edited, or extended to create a
full length sequence. Sequence intervals in which the entire length of the interval was present on 
more than one sequence in the cluster were identified, and intervals thus identified were considered to 
be equivalent by transitivity. For example, if an interval was present on a cDNA and two genomic
sequences, then all three intervals were considered to be equivalent. This process allows unrelated
but consecutive genomic sequences to be brought together, bridged by cDNA sequence. Intervals
thus identified were then "stitched" together by the stitching algorithm in the order that they appear 
along their parent sequences to generate the longest possible sequence, as well as sequence variants. 
Linkages between intervals which proceed along one type of parent sequence (cDNA to cDNA or
genomic sequence to genomic sequence) were given preference over linkages which change parent
type (cDNA to genomic sequence). The resultant stitched sequences were translated and compared
by BLAST analysis to the genpept and gbpri public databases. Incorrect exons predicted by Genscan 
were corrected by comparison to the top BLAST hit from genpept. Sequences were further extended 
with additional cDNA sequences, or by inspection of genomic DNA, when necessary. 
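The grouping of intervals "by transitivity" described above can be illustrated with a union-find sketch. The text does not disclose the stitching algorithm itself, so the data layout (interval labels plus pairwise matches) and the function name are assumptions, and only the equivalence step is shown, not the ordering or preference rules.

```python
def group_equivalent_intervals(intervals, matches):
    """Group intervals into equivalence classes using union-find.

    intervals: iterable of hashable interval labels.
    matches: pairs (a, b) of intervals found to cover the same sequence.
    If a matches b and a matches c, then a, b, and c end up in one group.
    """
    parent = {interval: interval for interval in intervals}

    def find(x):
        # Follow parent links to the root, compressing the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in matches:
        parent[find(a)] = find(b)

    groups = {}
    for interval in parent:
        groups.setdefault(find(interval), set()).add(interval)
    return list(groups.values())
```

With one cDNA interval matched separately to intervals on two genomic sequences, all three fall into a single group, mirroring the example given in the text.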

"Stretched" Sequences

Partial DNA sequences were extended to full length with an algorithm based on BLAST 
analysis. First, partial cDNAs assembled as described in Example III were queried against public
databases such as the GenBank primate, rodent, mammalian, vertebrate, and eukaryote databases
using the BLAST program. The nearest GenBank protein homolog was then compared by BLAST
analysis to either Incyte cDNA sequences or GenScan exon predicted sequences described in
Example IV. A chimeric protein was generated by using the resultant high-scoring segment pairs
(HSPs) to map the translated sequences onto the GenBank protein homolog. Insertions or deletions
may occur in the chimeric protein with respect to the original GenBank protein homolog. The 
GenBank protein homolog, the chimeric protein, or both were used as probes to search for
homologous genomic sequences from the public human genome databases. Partial DNA sequences
were therefore "stretched" or extended by the addition of homologous genomic sequences. The
resultant stretched sequences were examined to determine whether they contained a complete gene.
VI. Chromosomal Mapping of NAAP Encoding Polynucleotides 

The sequences which were used to assemble SEQ ID NO:36-70 were compared with
sequences from the Incyte LIFESEQ database and public domain databases using BLAST and other
implementations of the Smith-Waterman algorithm. Sequences from these databases that matched
SEQ ID NO:36-70 were assembled into clusters of contiguous and overlapping sequences using
assembly algorithms such as Phrap (Table 7). Radiation hybrid and genetic mapping data available
from public resources such as the Stanford Human Genome Center (SHGC), Whitehead Institute for
Genome Research (WIGR), and Généthon were used to determine if any of the clustered sequences
had been previously mapped. Inclusion of a mapped sequence in a cluster resulted in the assignment 
of all sequences of that cluster, including its particular SEQ ID NO:, to that map location. 
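The assignment of a map location to every member of a cluster can be sketched as follows. The function name, the data layout, and the rule of skipping clusters whose mapped members disagree are assumptions for illustration; the text does not say how conflicting locations were handled.

```python
def assign_map_locations(clusters, known_locations):
    """Propagate a known map location to every sequence in the same cluster.

    clusters: list of lists of sequence identifiers.
    known_locations: {sequence identifier: map location} for previously
    mapped sequences. Clusters with no mapped member, or with members
    mapped to different locations, are left unassigned (an assumption).
    """
    assigned = {}
    for cluster in clusters:
        locations = {known_locations[s] for s in cluster if s in known_locations}
        if len(locations) == 1:
            location = locations.pop()
            for sequence in cluster:
                assigned[sequence] = location
    return assigned
```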

Map locations are represented by ranges, or intervals, of human chromosomes. The map
position of an interval, in centiMorgans, is measured relative to the terminus of the chromosome's
p-arm. (The centiMorgan (cM) is a unit of measurement based on recombination frequencies between
chromosomal markers. On average, 1 cM is roughly equivalent to 1 megabase (Mb) of DNA in
humans, although this can vary widely due to hot and cold spots of recombination.) The cM distances
are based on genetic markers mapped by Généthon which provide boundaries for radiation hybrid
markers whose sequences were included in each of the clusters. Human genome maps and other
resources available to the public, such as the NCBI "GeneMap'99" World Wide Web site
(http://www.ncbi.nlm.nih.gov/genemap/), can be employed to determine if previously identified
disease genes map within or in proximity to the intervals indicated above.
VII. Analysis of Polynucleotide Expression 

Northern analysis is a laboratory technique used to detect the presence of a transcript of a
gene and involves the hybridization of a labeled nucleotide sequence to a membrane on which RNAs
from a particular cell type or tissue have been bound (Sambrook, supra, ch. 7; Ausubel et al., supra, 
ch. 4). 

Analogous computer techniques applying BLAST were used to search for identical or related 
molecules in cDNA databases such as GenBank or LIFESEQ (Incyte Genomics). This analysis is
much faster than multiple membrane-based hybridizations. In addition, the sensitivity of the
computer search can be modified to determine whether any particular match is categorized as exact or
similar. The basis of the search is the product score, which is defined as: 

$$\text{Product score} = \frac{\text{BLAST Score} \times \text{Percent Identity}}{5 \times \min\{\text{length(Seq. 1)},\ \text{length(Seq. 2)}\}}$$

The product score takes into account both the degree of similarity between two sequences and the 
length of the sequence match. The product score is a normalized value between 0 and 100, and is
calculated as follows: the BLAST score is multiplied by the percent nucleotide identity and the
product is divided by (5 times the length of the shorter of the two sequences). The BLAST score is
calculated by assigning a score of +5 for every base that matches in a high-scoring segment pair
(HSP), and -4 for every mismatch. Two sequences may share more than one HSP (separated by
gaps). If there is more than one HSP, then the pair with the highest BLAST score is used to calculate
the product score. The product score represents a balance between fractional overlap and quality in a
BLAST alignment. For example, a product score of 100 is produced only for 100% identity over the
entire length of the shorter of the two sequences being compared. A product score of 70 is produced 
either by 100% identity and 70% overlap at one end, or by 88% identity and 100% overlap at the 
other. A product score of 50 is produced either by 100% identity and 50% overlap at one end, or 79% 
identity and 100% overlap. 
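The product score calculation described above can be written out as a short sketch. The function names are hypothetical; the +5/-4 scoring and the division by 5 times the shorter length follow the text.

```python
def blast_score(matches, mismatches):
    """Score one HSP: +5 for every matching base, -4 for every mismatch."""
    return 5 * matches - 4 * mismatches


def product_score(score, percent_identity, length_seq1, length_seq2):
    """BLAST score x percent identity / (5 x length of the shorter sequence)."""
    return score * percent_identity / (5 * min(length_seq1, length_seq2))


# Worked cases for a 100-base shorter sequence (the other is 250 bases):
full = product_score(blast_score(100, 0), 100.0, 100, 250)     # 100% identity, full overlap
partial = product_score(blast_score(70, 0), 100.0, 100, 250)   # 100% identity, 70% overlap
diverged = product_score(blast_score(88, 12), 88.0, 100, 250)  # 88% identity, full overlap
```

Under these definitions the first two cases evaluate to exactly 100 and 70, and the 88% identity case evaluates to roughly 69, which is consistent with the rounded value of 70 quoted in the text.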

Alternatively, polynucleotides encoding NAAP are analyzed with respect to the tissue sources
from which they were derived. For example, some full length sequences are assembled, at least in
part, with overlapping Incyte cDNA sequences (see Example III). Each cDNA sequence is derived
from a cDNA library constructed from a human tissue. Each human tissue is classified into one of the
following organ/tissue categories: cardiovascular system; connective tissue; digestive system;
embryonic structures; endocrine system; exocrine glands; genitalia, female; genitalia, male; germ
cells; hemic and immune system; liver; musculoskeletal system; nervous system; pancreas; 
respiratory system; sense organs; skin; stomatognathic system; unclassified/mixed; or urinary tract. 
The nimiber of libraries ia each category is counted and divided by the total number of libraries 
across all categories. Similarly, each human tissue is classified into one of the following 

10 disease/condition categories: cancer, cell line, developmental, inflammation, neurological, trauma, 
cardiovascular, pooled, and other, and the nxmiber of libraries in each category is counted and divided 
by the total number of libraries across all categories. The resulting percentages reflect the tissue- and 
disease-specific expression of cDNA encoding NAAP. cDNA sequences and cDNA library/tissue 
information are found in the LIFESEQ GOLD database (Ihcyte Genomics, Palo Alto CA). 
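
The category percentages described above amount to a count per category divided by the total number of libraries. A minimal Python sketch follows; the function name and the example library list are hypothetical and are not taken from the LIFESEQ GOLD database.

```python
from collections import Counter


def category_percentages(library_categories):
    """Return the percentage of libraries falling in each category.

    `library_categories` holds one category label per cDNA library.
    """
    counts = Counter(library_categories)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {category: 100.0 * n / total for category, n in counts.items()}


# Hypothetical example: four libraries contributing sequences to one polynucleotide.
example = ["nervous system", "nervous system", "liver", "skin"]
percentages = category_percentages(example)
```

The same function applies unchanged to the disease/condition categories, since only the labels differ.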

VIII. Extension of NAAP Encoding Polynucleotides

Full length polynucleotides are produced by extension of an appropriate fragment of the full
length molecule using oligonucleotide primers designed from this fragment. One primer was
synthesized to initiate 5' extension of the known fragment, and the other primer was synthesized to
initiate 3' extension of the known fragment. The initial primers were designed using OLIGO 4.06
software (National Biosciences), or another appropriate program, to be about 22 to 30 nucleotides in
length, to have a GC content of about 50% or more, and to anneal to the target sequence at
temperatures of about 68°C to about 72°C. Any stretch of nucleotides which would result in hairpin
structures and primer-primer dimerizations was avoided.
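
Two of the primer design criteria stated above, length and GC content, are simple enough to check directly. The Python sketch below is an illustration under that narrow scope; it is not the OLIGO 4.06 algorithm, it does not evaluate annealing temperature, hairpins, or primer-primer dimers, and the function names are assumptions.

```python
def gc_content(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    seq = primer.upper()
    if not seq:
        raise ValueError("primer must not be empty")
    return 100.0 * (seq.count("G") + seq.count("C")) / len(seq)


def meets_basic_criteria(primer: str, min_len: int = 22, max_len: int = 30,
                         min_gc: float = 50.0) -> bool:
    """Check only the length (about 22 to 30 nt) and GC content (about 50% or more)."""
    return min_len <= len(primer) <= max_len and gc_content(primer) >= min_gc
```

For instance, a hypothetical 24-mer with 12 G or C bases passes both checks, while a 24-mer made only of A and T fails the GC criterion.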

Selected human cDNA libraries were used to extend the sequence. If more than one
extension was necessary or desired, additional or nested sets of primers were designed.

High fidelity amplification was obtained by PCR using methods well known in the art. PCR
was performed in 96-well plates using the PTC-200 thermal cycler (MJ Research, Inc.). The reaction
mix contained DNA template, 200 nmol of each primer, reaction buffer containing Mg2+, (NH4)2SO4,
and 2-mercaptoethanol, Taq DNA polymerase (Amersham Biosciences), ELONGASE enzyme
(Invitrogen), and Pfu DNA polymerase (Stratagene), with the following parameters for primer pair
PCI A and PCI B: Step 1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 68°C, 2
min; Step 5: Steps 2, 3, and 4 repeated 20 times; Step 6: 68°C, 5 min; Step 7: storage at 4°C. In the
alternative, the parameters for primer pair T7 and SK+ were as follows: Step 1: 94°C, 3 min; Step 2:
94°C, 15 sec; Step 3: 57°C, 1 min; Step 4: 68°C, 2 min; Step 5: Steps 2, 3, and 4 repeated 20 times;
Step 6: 68°C, 5 min; Step 7: storage at 4°C.

The concentration of DNA in each well was determined by dispensing 100 µl PICOGREEN
quantitation reagent (0.25% (v/v) PICOGREEN; Molecular Probes, Eugene OR) dissolved in 1X TE
and 0.5 µl of undiluted PCR product into each well of an opaque fluorimeter plate (Corning Costar,
Acton MA), allowing the DNA to bind to the reagent. The plate was scanned in a Fluoroskan II
(Labsystems Oy, Helsinki, Finland) to measure the fluorescence of the sample and to quantify the
concentration of DNA. A 5 µl to 10 µl aliquot of the reaction mixture was analyzed by
electrophoresis on a 1% agarose gel to determine which reactions were successful in extending the
sequence.
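
The extension PCR cycling parameters listed earlier in this example can be written down as a small data structure, which makes the nominal run time easy to compute. The Python sketch below is illustrative only: the class and variable names are hypothetical, ramp times are ignored, and it assumes that "repeated 20 times" means Steps 2 to 4 run once and are then repeated 20 more times.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    temp_c: int    # block temperature in degrees Celsius
    seconds: int   # hold time in seconds


def total_seconds(initial, cycle, repeats, final):
    """Nominal run time, excluding ramping and the final 4 degree C storage hold.

    Assumption: the cycle is run once and then repeated `repeats` more times.
    """
    per_cycle = sum(step.seconds for step in cycle)
    return initial.seconds + (1 + repeats) * per_cycle + final.seconds


# Parameters as stated for primer pair PCI A and PCI B.
PCI_PROGRAM = {
    "initial": Step(94, 180),
    "cycle": [Step(94, 15), Step(60, 60), Step(68, 120)],
    "repeats": 20,
    "final": Step(68, 300),
}

# Primer pair T7 and SK+ differs only in the 57 degree C annealing step.
T7_SK_PROGRAM = {**PCI_PROGRAM,
                 "cycle": [Step(94, 15), Step(57, 60), Step(68, 120)]}
```

Under these assumptions both programs take 4575 seconds, a little over 76 minutes, before the storage step.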

The extended nucleotides were desalted and concentrated, transferred to 384-well plates,
digested with CviJI cholera virus endonuclease (Molecular Biology Research, Madison WI), and
sonicated or sheared prior to religation into pUC 18 vector (Amersham Biosciences). For shotgun
sequencing, the digested nucleotides were separated on low concentration (0.6 to 0.8%) agarose gels,
fragments were excised, and agar digested with Agar ACE (Promega). Extended clones were
religated using T4 ligase (New England Biolabs, Beverly MA) into pUC 18 vector (Amersham
Biosciences), treated with Pfu DNA polymerase (Stratagene) to fill-in restriction site overhangs, and
transfected into competent E. coli cells. Transformed cells were selected on antibiotic-containing
media, and individual colonies were picked and cultured overnight at 37°C in 384-well plates in
LB/2x carb liquid media.

The cells were lysed, and DNA was amplified by PCR using Taq DNA polymerase
(Amersham Biosciences) and Pfu DNA polymerase (Stratagene) with the following parameters: Step
1: 94°C, 3 min; Step 2: 94°C, 15 sec; Step 3: 60°C, 1 min; Step 4: 72°C, 2 min; Step 5: steps 2, 3,
and 4 repeated 29 times; Step 6: 72°C, 5 min; Step 7: storage at 4°C. DNA was quantified by
PICOGREEN reagent (Molecular Probes) as described above. Samples with low DNA recoveries
were reamplified using the same conditions as described above. Samples were diluted with 20%
dimethylsulfoxide (1:2, v/v), and sequenced using DYENAMIC energy transfer sequencing primers
and the DYENAMIC DIRECT kit (Amersham Biosciences) or the ABI PRISM BIGDYE Terminator
cycle sequencing ready reaction kit (Applied Biosystems).

In like manner, full length polynucleotides are verified using the above procedure or are used
to obtain 5' regulatory sequences using the above procedure along with oligonucleotides designed for
such extension, and an appropriate genomic library.

IX. Identification of Single Nucleotide Polymorphisms in NAAP Encoding Polynucleotides

Common DNA sequence variants known as single nucleotide polymorphisms (SNPs) were
identified in SEQ ID NO:36-70 using the LIFESEQ database (Incyte Genomics). Sequences from the
same gene were clustered together and assembled as described in Example III, allowing the
identification of all sequence variants in the gene. An algorithm consisting of a series of filters was
used to distinguish SNPs from other sequence variants. Preliminary filters removed the majority of
basecall errors by requiring a minimum Phred quality score of 15, and removed sequence alignment
errors and errors resulting from improper trimming of vector sequences, chimeras, and splice variants.
An automated procedure of advanced chromosome analysis analyzed the original chromatogram files
in the vicinity of the putative SNP. Clone error filters used statistically generated algorithms to
identify errors introduced during laboratory processing, such as those caused by reverse transcriptase,
polymerase, or somatic mutation. Clustering error filters used statistically generated algorithms to
identify errors resulting from clustering of close homologs or pseudogenes, or due to contamination
by non-human sequences. A final set of filters removed duplicates and SNPs found in
immunoglobulins or T-cell receptors.
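
Of the filters described above, only the preliminary basecall filter is specified numerically (a minimum Phred quality score of 15). The Python sketch below illustrates that one step, using the standard Phred definition Q = -10 log10(P); the function names and the (position, base, quality) tuple layout are assumptions, and the later clone-error and clustering-error filters are not modeled.

```python
def phred_error_probability(quality: float) -> float:
    """Estimated basecall error probability for a Phred score Q = -10*log10(P)."""
    return 10 ** (-quality / 10)


def filter_candidates(candidates, min_quality: int = 15):
    """Keep candidate variant calls whose Phred quality meets the threshold.

    `candidates` is a list of (position, base, phred_quality) tuples,
    a hypothetical layout chosen for illustration.
    """
    return [c for c in candidates if c[2] >= min_quality]
```

A threshold of 15 corresponds to an estimated error probability of about 3.2% per basecall, so this filter alone leaves some miscalls for the later filters to remove.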

Certain SNPs were selected for further characterization by mass spectrometry using the high
throughput MASSARRAY system (Sequenom, Inc.) to analyze allele frequencies at the SNP sites in
four different human populations. The Caucasian population comprised 92 individuals (46 male, 46
female), including 83 from Utah, four French, three Venezuelan, and two Amish individuals. The
African population comprised 194 individuals (97 male, 97 female), all African Americans. The
Hispanic population comprised 324 individuals (162 male, 162 female), all Mexican Hispanic. The
Asian population comprised 126 individuals (64 male, 62 female) with a reported parental breakdown
of 43% Chinese, 31% Japanese, 13% Korean, 5% Vietnamese, and 8% other Asian. Allele
frequencies were first analyzed in the Caucasian population; in some cases those SNPs which showed
no allelic variance in this population were not further tested in the other three populations.
X. Labeling and Use of Individual Hybridization Probes 

Hybridization probes derived from SEQ ID NO:36-70 are employed to screen cDNAs,
genomic DNAs, or mRNAs. Although the labeling of oligonucleotides, consisting of about 20 base
pairs, is specifically described, essentially the same procedure is used with larger nucleotide
fragments. Oligonucleotides are designed using state-of-the-art software such as OLIGO 4.06
software (National Biosciences) and labeled by combining 50 pmol of each oligomer, 250 µCi of
[γ-32P] adenosine triphosphate (Amersham Biosciences), and T4 polynucleotide kinase (DuPont NEN,
Boston MA). The labeled oligonucleotides are substantially purified using a SEPHADEX G-25
superfine size exclusion dextran bead column (Amersham Biosciences). An aliquot containing 10^
counts per minute of the labeled probe is used in a typical membrane-based hybridization analysis of
human genomic DNA digested with one of the following endonucleases: Ase I, Bgl II, Eco RI, Pst I,
Xba I, or Pvu II (DuPont NEN).

The DNA from each digest is fractionated on a 0.7% agarose gel and transferred to nylon
membranes (Nytran Plus, Schleicher & Schuell, Durham NH). Hybridization is carried out for 16
hours at 40°C. To remove nonspecific signals, blots are sequentially washed at room temperature
under conditions of up to, for example, 0.1 x saline sodium citrate and 0.5% sodium dodecyl sulfate.
Hybridization patterns are visualized using autoradiography or an alternative imaging means and
compared.

XI. Microarrays

The linkage or synthesis of array elements upon a microarray can be achieved utilizing
photolithography, piezoelectric printing (ink-jet printing; see, e.g., Baldeschweiler et al., supra),
mechanical microspotting technologies, and derivatives thereof. The substrate in each of the
aforementioned technologies should be uniform and solid with a non-porous surface (Schena, M., ed.
(1999) DNA Microarrays: A Practical Approach, Oxford University Press, London). Suggested
substrates include silicon, silica, glass slides, glass chips, and silicon wafers. Alternatively, a
procedure analogous to a dot or slot blot may also be used to arrange and link elements to the surface
of a substrate using thermal, UV, chemical, or mechanical bonding procedures. A typical array may
be produced using available methods and machines well known to those of ordinary skill in the art
and may contain any appropriate number of elements (Schena, M. et al. (1995) Science 270:467-470;
Shalon, D. et al. (1996) Genome Res. 6:639-645; Marshall, A. and J. Hodgson (1998) Nat.
Biotechnol. 16:27-31).

Full length cDNAs, Expressed Sequence Tags (ESTs), or fragments or oligomers thereof may
comprise the elements of the microarray. Fragments or oligomers suitable for hybridization can be
selected using software well known in the art such as LASERGENE software (DNASTAR). The
array elements are hybridized with polynucleotides in a biological sample. The polynucleotides in the
biological sample are conjugated to a fluorescent label or other molecular tag for ease of detection.
After hybridization, nonhybridized nucleotides from the biological sample are removed, and a
fluorescence scanner is used to detect hybridization at each array element. Alternatively, laser
desorption and mass spectrometry may be used for detection of hybridization. The degree of
complementarity and the relative abundance of each polynucleotide which hybridizes to an element
on the microarray may be assessed. In one embodiment, microarray preparation and usage is
described in detail below.
Tissue or Cell Sample Preparation 

Total RNA is isolated from tissue samples using the guanidinium thiocyanate method and
poly(A)+ RNA is purified using the oligo-(dT) cellulose method. Each poly(A)+ RNA sample is
reverse transcribed using MMLV reverse-transcriptase, 0.05 pg/µl oligo-(dT) primer (21mer), 1X first
strand buffer, 0.03 units/µl RNase inhibitor, 500 µM dATP, 500 µM dGTP, 500 µM dTTP, 40 µM
dCTP, 40 µM dCTP-Cy3 (BDS) or dCTP-Cy5 (Amersham Biosciences). The reverse transcription
reaction is performed in a 25 ml volume containing 200 ng poly(A)+ RNA with GEMBRIGHT kits
(Incyte). Specific control poly(A)+ RNAs are synthesized by in vitro transcription from non-coding
yeast genomic DNA. After incubation at C for 2 hr, each reaction sample (one with Cy3 and
another with Cy5 labeling) is treated with 2.5 ml of 0.5M sodium hydroxide and incubated for 20
minutes at 85°C to stop the reaction and degrade the RNA. Samples are purified using two
successive CHROMA SPIN 30 gel filtration spin columns (CLONTECH Laboratories, Inc.
(CLONTECH), Palo Alto CA) and after combining, both reaction samples are ethanol precipitated
using 1 ml of glycogen (1 mg/ml), 60 ml sodium acetate, and 300 ml of 100% ethanol. The sample is
then dried to completion using a SpeedVAC (Savant Instruments Inc., Holbrook NY) and
resuspended in 14 µl 5X SSC/0.2% SDS.
Microarray Preparation 

Sequences of the present invention are used to generate array elements. Each array element is
amplified from bacterial cells containing vectors with cloned cDNA inserts. PCR amplification uses
primers complementary to the vector sequences flanking the cDNA insert. Array elements are
amplified in thirty cycles of PCR from an initial quantity of 1-2 ng to a final quantity greater than 5
µg. Amplified array elements are then purified using SEPHACRYL-400 (Amersham Biosciences).

Purified array elements are immobilized on polymer-coated glass slides. Glass microscope
slides (Corning) are cleaned by ultrasound in 0.1% SDS and acetone, with extensive distilled water
washes between and after treatments. Glass slides are etched in 4% hydrofluoric acid (VWR
Scientific Products Corporation (VWR), West Chester PA), washed extensively in distilled water, and
coated with 0.05% aminopropyl silane (Sigma) in 95% ethanol. Coated slides are cured in a 110°C
oven.

Array elements are applied to the coated glass substrate using a procedure described in U.S. 
Patent No. 5,807,522, incorporated herein by reference. 1 µl of the array element DNA, at an average
concentration of 100 ng/µl, is loaded into the open capillary printing element by a high-speed robotic
apparatus. The apparatus then deposits about 5 nl of array element sample per slide. 

Microarrays are UV-crosslinked using a STRATALINKER UV-crosslinker (Stratagene).
Microarrays are washed at room temperature once in 0.2% SDS and three times in distilled water.
Non-specific binding sites are blocked by incubation of microarrays in 0.2% casein in phosphate
buffered saline (PBS) (Tropix, Inc., Bedford MA) for 30 minutes at 60°C followed by washes in
0.2% SDS and distilled water as before.

Hybridization

Hybridization reactions contain 9 µl of sample mixture consisting of 0.2 µg each of Cy3 and
Cy5 labeled cDNA synthesis products in 5X SSC, 0.2% SDS hybridization buffer. The sample
mixture is heated to 65°C for 5 minutes and is aliquoted onto the microarray surface and covered with
a 1.8 cm2 coverslip. The arrays are transferred to a waterproof chamber having a cavity just slightly
larger than a microscope slide. The chamber is kept at 100% humidity internally by the addition of
140 µl of 5X SSC in a corner of the chamber. The chamber containing the arrays is incubated for
about 6.5 hours at 60°C. The arrays are washed for 10 min at 45°C in a first wash buffer (1X SSC,
0.1% SDS), three times for 10 minutes each at 45°C in a second wash buffer (0.1X SSC), and dried.
Detection 

Reporter-labeled hybridization complexes are detected with a microscope equipped with an
Innova 70 mixed gas 10 W laser (Coherent, Inc., Santa Clara CA) capable of generating spectral lines
at 488 nm for excitation of Cy3 and at 632 nm for excitation of Cy5. The excitation laser light is
focused on the array using a 20X microscope objective (Nikon, Inc., Melville NY). The slide
containing the array is placed on a computer-controlled X-Y stage on the microscope and
raster-scanned past the objective. The 1.8 cm x 1.8 cm array used in the present example is scanned with a
resolution of 20 micrometers.
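
The scan geometry stated above implies an image size that is easy to work out. The short Python sketch below assumes one pixel per 20 micrometer step in each direction, which is an interpretation of "resolution" introduced here; the function name is hypothetical.

```python
def pixels_per_side(side_cm: float, resolution_um: float) -> int:
    """Number of scan positions along one side, assuming one pixel per step."""
    side_um = side_cm * 10_000  # 1 cm = 10,000 micrometers
    return round(side_um / resolution_um)


side = pixels_per_side(1.8, 20)   # scan positions along one 1.8 cm side
total_pixels = side * side        # pixels in the full square scan
```

Under that assumption a 1.8 cm x 1.8 cm array scanned at 20 micrometers yields 900 positions per side, or 810,000 pixels per fluorophore channel.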

In two separate scans, a mixed gas multiline laser excites the two fluorophores sequentially.
Emitted light is split, based on wavelength, into two photomultiplier tube detectors (PMT R1477,
Hamamatsu Photonics Systems, Bridgewater NJ) corresponding to the two fluorophores. Appropriate
filters positioned between the array and the photomultiplier tubes are used to filter the signals. The
emission maxima of the fluorophores used are 565 nm for Cy3 and 650 nm for Cy5. Each array is
typically scanned twice, one scan per fluorophore using the appropriate filters at the laser source,
although the apparatus is capable of recording the spectra from both fluorophores simultaneously.
The sensitivity of the scans is typically calibrated using the signal intensity generated by a
cDNA control species added to the sample mixture at a known concentration. A specific location on
the array contains a complementary DNA sequence, allowing the intensity of the signal at that
location to be correlated with a weight ratio of hybridizing species of 1:100,000. When two samples
from different sources (e.g., representing test and control cells), each labeled with a different
fluorophore, are hybridized to a single array for the purpose of identifying genes that are differentially
expressed, the calibration is done by labeling samples of the calibrating cDNA with the two
fluorophores and adding identical amounts of each to the hybridization mixture.

The output of the photomultiplier tube is digitized using a 12-bit RTI-835H analog-to-digital
(A/D) conversion board (Analog Devices, Inc., Norwood MA) installed in an IBM-compatible PC
computer. The digitized data are displayed as an image where the signal intensity is mapped using a
linear 20-color transformation to a pseudocolor scale ranging from blue (low signal) to red (high
signal). The data are also analyzed quantitatively. Where two different fluorophores are excited and
measured simultaneously, the data are first corrected for optical crosstalk (due to overlapping
emission spectra) between the fluorophores using each fluorophore's emission spectrum.
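
The linear 20-color display mapping described above can be illustrated with a short Python sketch. It assumes that the 12-bit range (0 to 4095) is divided into 20 equal-width bins, with index 0 at the blue end and index 19 at the red end; the actual color table and binning used by the display software are not given in the text, and the names below are hypothetical.

```python
ADC_LEVELS = 2 ** 12   # a 12-bit converter yields values 0..4095
NUM_COLORS = 20


def color_index(adc_value: int) -> int:
    """Map a 12-bit intensity to one of 20 equal-width pseudocolor bins.

    Index 0 is the blue (low signal) end, index 19 the red (high signal) end.
    """
    if not 0 <= adc_value < ADC_LEVELS:
        raise ValueError("value outside the 12-bit range")
    return adc_value * NUM_COLORS // ADC_LEVELS
```

Integer arithmetic keeps the bin edges exact, so the top value 4095 falls in the last bin rather than overflowing to an index of 20.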

WO 03/006618 PCT/US02/21971

A grid is superimposed over the fluorescence signal image such that the signal from each spot
is centered in each element of the grid. The fluorescence signal within each element is then
integrated to obtain a numerical value corresponding to the average intensity of the signal. The
software used for signal analysis is the GEMTOOLS gene expression analysis program (Incyte).
Expression 

Array elements that exhibited at least about a two-fold change in expression, a
signal-to-background ratio of at least 2.5, and an element spot size of at least 40% were identified as
differentially expressed using the GEMTOOLS program (Incyte Genomics).
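
The three selection criteria stated above can be expressed as a simple predicate. The Python sketch below is an illustration and not the GEMTOOLS implementation; it assumes that a "two-fold change" counts in either direction (increase or decrease) and that the expression change is supplied as a test/control ratio. All names are hypothetical.

```python
def is_differentially_expressed(ratio: float, signal_to_background: float,
                                spot_size_pct: float, min_fold: float = 2.0,
                                min_sb: float = 2.5,
                                min_spot_pct: float = 40.0) -> bool:
    """Apply the fold-change, signal-to-background, and spot-size thresholds.

    `ratio` is the test/control signal ratio; a ratio of 0.4 is treated as a
    2.5-fold decrease (an assumption about how the change is measured).
    """
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    fold_change = max(ratio, 1.0 / ratio)
    return (fold_change >= min_fold
            and signal_to_background >= min_sb
            and spot_size_pct >= min_spot_pct)
```

An element must pass all three tests, so a strong expression change on a faint or undersized spot is still rejected.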

SEQ ID NO:57 showed differential expression, as determined by microarray analysis, in
human aortic endothelial cells (HAEC) following exposure to 10 ng/ml TNF-α for 24 and 48 hours.
TNF-α is a pleiotropic cytokine that is known to play a central role in the mediation of inflammatory
responses through activation of multiple signal transduction pathways. HAECs are primary cells
derived from the endothelium of a human aorta. These cells were grown to 85% confluency and then
treated with TNF-α. The expression of SEQ ID NO:57 was increased by at least two-fold in
TNF-α-treated HAECs, as compared to untreated controls. Therefore, in various embodiments, SEQ ID
NO:57 can be used for one or more of the following: i) monitoring treatment of immune disorders and
related diseases and conditions, ii) diagnostic assays for immune disorders and related diseases and
conditions, and iii) developing therapeutics and/or other treatments for immune disorders and related
diseases and conditions.
XII. Complementary Polynucleotides 

Sequences complementary to the NAAP-encoding sequences, or any parts thereof, are used to
detect, decrease, or inhibit expression of naturally occurring NAAP. Although use of
oligonucleotides comprising from about 15 to 30 base pairs is described, essentially the same
procedure is used with smaller or with larger sequence fragments. Appropriate oligonucleotides are
designed using OLIGO 4.06 software (National Biosciences) and the coding sequence of NAAP. To
inhibit transcription, a complementary oligonucleotide is designed from the most unique 5' sequence
and used to prevent promoter binding to the coding sequence. To inhibit translation, a
complementary oligonucleotide is designed to prevent ribosomal binding to the NAAP-encoding
transcript.

XIII. Expression of NAAP

Expression and purification of NAAP is achieved using bacterial or virus-based expression
systems. For expression of NAAP in bacteria, cDNA is subcloned into an appropriate vector
containing an antibiotic resistance gene and an inducible promoter that directs high levels of cDNA
transcription. Examples of such promoters include, but are not limited to, the trp-lac (tac) hybrid
promoter and the T5 or T7 bacteriophage promoter in conjunction with the lac operator regulatory
element. Recombinant vectors are transformed into suitable bacterial hosts, e.g., BL21(DE3).
Antibiotic resistant bacteria express NAAP upon induction with isopropyl
beta-D-thiogalactopyranoside (IPTG). Expression of NAAP in eukaryotic cells is achieved by infecting
insect or mammalian cell lines with recombinant Autographa californica nuclear polyhedrosis virus
(AcMNPV), commonly known as baculovirus. The nonessential polyhedrin gene of baculovirus is
replaced with cDNA encoding NAAP by either homologous recombination or bacterial-mediated
transposition involving transfer plasmid intermediates. Viral infectivity is maintained and the strong
polyhedrin promoter drives high levels of cDNA transcription. Recombinant baculovirus is used to
infect Spodoptera frugiperda (Sf9) insect cells in most cases, or human hepatocytes, in some cases.
Infection of the latter requires additional genetic modifications to baculovirus (Engelhard, E.K. et al.
(1994) Proc. Natl. Acad. Sci. USA 91:3224-3227; Sandig, V. et al. (1996) Hum. Gene Ther.
7:1937-1945).

In most expression systems, NAAP is synthesized as a fusion protein with, e.g., glutathione
S-transferase (GST) or a peptide epitope tag, such as FLAG or 6-His, permitting rapid, single-step,
affinity-based purification of recombinant fusion protein from crude cell lysates. GST, a
26-kilodalton enzyme from Schistosoma japonicum, enables the purification of fusion proteins on
immobilized glutathione under conditions that maintain protein activity and antigenicity (Amersham
Biosciences). Following purification, the GST moiety can be proteolytically cleaved from NAAP at
specifically engineered sites. FLAG, an 8-amino acid peptide, enables immunoaffinity purification
using commercially available monoclonal and polyclonal anti-FLAG antibodies (Eastman Kodak).
6-His, a stretch of six consecutive histidine residues, enables purification on metal-chelate resins
(QIAGEN). Methods for protein expression and purification are discussed in Ausubel et al. (supra,
ch. 10 and 16). Purified NAAP obtained by these methods can be used directly in the assays shown
in Examples XVII, XVIII, and XIX, where applicable.
XIV. Functional Assays 

NAAP function is assessed by expressing the sequences encoding NAAP at physiologically
elevated levels in mammalian cell culture systems. cDNA is subcloned into a mammalian expression
vector containing a strong promoter that drives high levels of cDNA expression. Vectors of choice
include PCMV SPORT plasmid (Invitrogen, Carlsbad CA) and PCR3.1 plasmid (Invitrogen), both of
which contain the cytomegalovirus promoter. 5-10 µg of recombinant vector are transiently
transfected into a human cell line, for example, an endothelial or hematopoietic cell line, using either
liposome formulations or electroporation. 1-2 µg of an additional plasmid containing sequences
encoding a marker protein are co-transfected. Expression of a marker protein provides a means to
distinguish transfected cells from nontransfected cells and is a reliable predictor of cDNA expression
from the recombinant vector. Marker proteins of choice include, e.g., Green Fluorescent Protein
(GFP; Clontech), CD64, or a CD64-GFP fusion protein. Flow cytometry (FCM), an automated, laser
optics-based technique, is used to identify transfected cells expressing GFP or CD64-GFP and to
evaluate the apoptotic state of the cells and other cellular properties. FCM detects and quantifies the
uptake of fluorescent molecules that diagnose events preceding or coincident with cell death. These
events include changes in nuclear DNA content as measured by staining of DNA with propidium
iodide; changes in cell size and granularity as measured by forward light scatter and 90 degree side
light scatter; down-regulation of DNA synthesis as measured by decrease in bromodeoxyuridine
uptake; alterations in expression of cell surface and intracellular proteins as measured by reactivity
with specific antibodies; and alterations in plasma membrane composition as measured by the binding
of fluorescein-conjugated Annexin V protein to the cell surface. Methods in flow cytometry are
discussed in Ormerod, M.G. (1994; Flow Cytometry, Oxford, New York NY).

The influence of NAAP on gene expression can be assessed using highly purified populations
of cells transfected with sequences encoding NAAP and either CD64 or CD64-GFP. CD64 and
CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Lake Success
NY). mRNA can be purified from the cells using methods well known by those of skill in the art.
Expression of mRNA encoding NAAP and other genes of interest can be analyzed by northern
analysis or microarray techniques.

XV. Production of NAAP Specific Antibodies 

NAAP substantially purified using polyacrylamide gel electrophoresis (PAGE; see, e.g.,
Harrington, M.G. (1990) Methods Enzymol. 182:488-495), or other purification techniques, is used to
immunize animals (e.g., rabbits, mice, etc.) and to produce antibodies using standard protocols.

Alternatively, the NAAP amino acid sequence is analyzed using LASERGENE software
(DNASTAR) to determine regions of high immunogenicity, and a corresponding oligopeptide is
synthesized and used to raise antibodies by means known to those of skill in the art. Methods for
selection of appropriate epitopes, such as those near the C-terminus or in hydrophilic regions, are well
described in the art (Ausubel et al., supra, ch. 11).

Typically, oligopeptides of about 15 residues in length are synthesized using an ABI 431A
peptide synthesizer (Applied Biosystems) using FMOC chemistry and coupled to KLH
(Sigma-Aldrich, St. Louis MO) by reaction with N-maleimidobenzoyl-N-hydroxysuccinimide ester (MBS) to
increase immunogenicity (Ausubel et al., supra). Rabbits are immunized with the oligopeptide-KLH
complex in complete Freund's adjuvant. Resulting antisera are tested for antipeptide and anti-NAAP
activity by, for example, binding the peptide or NAAP to a substrate, blocking with 1% BSA, reacting
with rabbit antisera, washing, and reacting with radio-iodinated goat anti-rabbit IgG.

XVI. Purification of Naturally Occurring NAAP Using Specific Antibodies 

Naturally occurring or recombinant NAAP is substantially purified by immunoaffinity
chromatography using antibodies specific for NAAP. An immunoaffinity column is constructed by
covalently coupling anti-NAAP antibody to an activated chromatographic resin, such as
CNBr-activated SEPHAROSE (Amersham Biosciences). After the coupling, the resin is blocked and
washed according to the manufacturer's instructions.

Media containing NAAP are passed over the immunoaffinity column, and the column is
washed under conditions that allow the preferential absorbance of NAAP (e.g., high ionic strength
buffers in the presence of detergent). The column is eluted under conditions that disrupt
antibody/NAAP binding (e.g., a buffer of pH 2 to pH 3, or a high concentration of a chaotrope, such
as urea or thiocyanate ion), and NAAP is collected.

XVII. Identification of Molecules Which Interact with NAAP

NAAP, or biologically active fragments thereof, are labeled with 125I Bolton-Hunter reagent
(Bolton, A.E. and W.M. Hunter (1973) Biochem. J. 133:529-539). Candidate molecules previously
arrayed in the wells of a multi-well plate are incubated with the labeled NAAP, washed, and any
wells with labeled NAAP complex are assayed. Data obtained using different concentrations of
NAAP are used to calculate values for the number, affinity, and association of NAAP with the
candidate molecules.

Alternatively, molecules interacting with NAAP are analyzed using the yeast two-hybrid 
system as described in Fields, S. and O. Song (1989; Nature 340:245-246), or using commercially 
available kits based on the two-hybrid system, such as the MATCHMAKER system (Clontech). 

NAAP may also be used in the PATHCALLING process (CuraGen Corp., New Haven CT)
which employs the yeast two-hybrid system in a high-throughput manner to determine all interactions
between the proteins encoded by two large libraries of genes (Nandabalan, K. et al. (2000) U.S.
Patent No. 6,057,101).
XVIII. Demonstration of NAAP Activity 

25 NAAP activity is measured by its ability to stimulate transcription of a reporter gene (Liu, 

H.Y. et al. (1997) EMBO J. 16:5289-5298). The assay entails the use of a well characterized reporter 
gene construct, LexAop-LacZ, that consists of LexA DNA transcriptional control elements (LexA^p) 
fused to sequences encoding the E. coli LacZ enzyme. The methods for constructing and expressing 
fusion genes, introducing them into cells, and measuring LacZ enzyme activity, are well known to 

30 those skilled in the art. Sequences encoding NAAP are cloned into a plasmid that directs the 

synthesis of a fusion protein, LexA-NAAP, consisting of NAAP and a DNA binding domain derived 
from the LexA transcription factor. The resulting plasmid, encoding a LexA-NAAP fusion protein, is 
introduced into yeast cells along with a plasmid containing the LexA^p-LacZ reporter gene. The 
amount of LacZ enzyme activity associated with LexA-NAAP transfected cells, relative to control 

35 cells, is proportional to the amount of transcription stimulated by the NAAP. 
Alternatively, NAAP activity is measured by its ability to bind zinc. A 5-10 mM sample
solution in 2.5 mM ammonium acetate solution at pH 7.4 is combined with 0.05 M zinc sulfate
solution (Aldrich, Milwaukee WI) in the presence of 100 mM dithiothreitol with 10% methanol
added. The sample and zinc sulfate solutions are allowed to incubate for 20 minutes. The reaction
solution is passed through a VYDAC column (Grace Vydac, Hesperia, CA) with approximately 300
Angstrom bore size and 5 µm particle size to isolate zinc-sample complex from the solution, and into
a mass spectrometer (PE Sciex, Ontario, Canada). Zinc bound to sample is quantified using the
functional atomic mass of 63.5 Da observed by Whittal, R.M. et al. ((2000) Biochemistry 39:8406-
8417).
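
One way to read the quantification step is to divide the mass shift between the zinc-sample complex and the zinc-free sample by the functional mass of 63.5 Da. The sketch below assumes that both masses are available from the mass spectrum; the function name and the example masses are hypothetical.

```python
ZINC_FUNCTIONAL_MASS_DA = 63.5  # functional atomic mass stated in the text


def zinc_ions_bound(complex_mass_da, apo_mass_da):
    """Estimate the number of zinc ions bound from the observed mass shift.

    complex_mass_da: mass of the zinc-sample complex (Da)
    apo_mass_da:     mass of the zinc-free sample (Da), assumed to be measured
    """
    shift = complex_mass_da - apo_mass_da
    if shift < 0:
        raise ValueError("complex mass is lower than zinc-free mass")
    return shift / ZINC_FUNCTIONAL_MASS_DA


# Hypothetical masses: a 127 Da shift corresponds to two zinc ions.
print(zinc_ions_bound(10127.0, 10000.0))
```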

In the alternative, a method to determine nucleic acid binding activity of NAAP involves a
polyacrylamide gel mobility-shift assay. In preparation for this assay, NAAP is expressed by
transforming a mammalian cell line such as COS7, HeLa or CHO with a eukaryotic expression vector
containing NAAP cDNA. The cells are incubated for 48-72 hours after transformation under
conditions appropriate for the cell line to allow expression and accumulation of NAAP. Extracts
containing solubilized proteins can be prepared from cells expressing NAAP by methods well known
in the art. Portions of the extract containing NAAP are added to [³²P]-labeled RNA or DNA.
Radioactive nucleic acid can be synthesized in vitro by techniques well known in the art. The
mixtures are incubated at 25°C in the presence of RNase- and DNase-inhibitors under buffered
conditions for 5-10 minutes. After incubation, the samples are analyzed by polyacrylamide gel
electrophoresis followed by autoradiography. The presence of a band on the autoradiogram indicates
the formation of a complex between NAAP and the radioactive transcript. A band of similar mobility
will not be present in samples prepared using control extracts prepared from untransformed cells.

In the alternative, a method to determine methylase activity of NAAP measures transfer of
radiolabeled methyl groups between a donor substrate and an acceptor substrate. Reaction mixtures
(50 µl final volume) contain 15 mM HEPES, pH 7.9, 1.5 mM MgCl2, 10 mM dithiothreitol, 3%
polyvinylalcohol, 1.5 µCi [methyl-³H]AdoMet (0.375 µM AdoMet) (DuPont-NEN), 0.6 µg NAAP,
and acceptor substrate (e.g., 0.4 µg [³⁵S]RNA, or 6-mercaptopurine (6-MP) to 1 mM final
concentration). Reaction mixtures are incubated at 30°C for 30 minutes, then 65°C for 5 minutes.

Analysis of [methyl-³H]RNA is as follows: (1) 50 µl of 2 x loading buffer (20 mM Tris-HCl,
pH 7.6, 1 M LiCl, 1 mM EDTA, 1% sodium dodecyl sulphate (SDS)) and 50 µl oligo d(T)-cellulose
(10 mg/ml in 1 x loading buffer) are added to the reaction mixture, and incubated at ambient
temperature with shaking for 30 minutes. (2) Reaction mixtures are transferred to a 96-well filtration
plate attached to a vacuum apparatus. (3) Each sample is washed sequentially with three 2.4 ml
aliquots of 1 x oligo d(T) loading buffer containing 0.5% SDS, 0.1% SDS, or no SDS. (4) RNA is
eluted with 300 µl of water into a 96-well collection plate, transferred to scintillation vials containing
liquid scintillant, and radioactivity determined.

Analysis of [methyl-³H]6-MP is as follows: (1) 500 µl 0.5 M borate buffer, pH 10.0, and then
2.5 ml of 20% (v/v) isoamyl alcohol in toluene are added to the reaction mixtures. (2) The samples
are mixed by vigorous vortexing for ten seconds. (3) After centrifugation at 700g for 10 minutes, 1.5
ml of the organic phase is transferred to scintillation vials containing 0.5 ml absolute ethanol and
liquid scintillant, and radioactivity determined. (4) Results are corrected for the extraction of 6-MP
into the organic phase (approximately 41%).
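
The correction in step (4) can be sketched as follows. The text states only that results are corrected for the approximately 41% extraction of 6-MP; scaling for the 1.5 ml counted out of the 2.5 ml organic phase and subtracting a blank are additional assumptions made here, and the function name and example counts are hypothetical.

```python
def corrected_methyl_6mp_dpm(counted_dpm,
                             blank_dpm=0.0,
                             organic_total_ml=2.5,
                             counted_ml=1.5,
                             extraction_efficiency=0.41):
    """Estimate total [methyl-3H]6-MP counts formed in one reaction.

    counted_dpm:           counts in the aliquot of organic phase
    blank_dpm:             counts from a no-enzyme blank (assumed control)
    organic_total_ml:      volume of isoamyl alcohol/toluene added (2.5 ml)
    counted_ml:            volume of organic phase counted (1.5 ml)
    extraction_efficiency: fraction of 6-MP extracted (approximately 0.41)
    """
    if not 0 < extraction_efficiency <= 1:
        raise ValueError("extraction_efficiency must be in (0, 1]")
    net = counted_dpm - blank_dpm
    # Scale the aliquot up to the whole organic phase, then correct
    # for the fraction of product that partitions into that phase.
    return net * (organic_total_ml / counted_ml) / extraction_efficiency


# Hypothetical reading of 1230 dpm in the counted aliquot.
print(round(corrected_methyl_6mp_dpm(1230.0)))
```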

In the alternative, type I topoisomerase activity of NAAP can be assayed based on the
relaxation of a supercoiled DNA substrate. NAAP is incubated with its substrate in a buffer lacking
Mg²⁺ and ATP, the reaction is terminated, and the products are loaded on an agarose gel. Altered
topoisomers can be distinguished from supercoiled substrate electrophoretically. This assay is
specific for type I topoisomerase activity because Mg²⁺ and ATP are necessary cofactors for type II
topoisomerases.

Type II topoisomerase activity of NAAP can be assayed based on the decatenation of a
kinetoplast DNA (KDNA) substrate. NAAP is incubated with KDNA, the reaction is terminated, and
the products are loaded on an agarose gel. Monomeric circular KDNA can be distinguished from
catenated KDNA electrophoretically. Kits for measuring type I and type II topoisomerase activities
are available commercially from Topogen (Columbus OH).

ATP-dependent RNA helicase unwinding activity of NAAP can be measured by the method
described by Zhang and Grosse (1994; Biochemistry 33:3906-3912). The substrate for RNA
unwinding consists of ³²P-labeled RNA composed of two RNA strands of 194 and 130 nucleotides in
length containing a duplex region of 17 base-pairs. The RNA substrate is incubated together with
ATP, Mg²⁺, and varying amounts of NAAP in a Tris-HCl buffer, pH 7.5, at 37°C for 30 minutes. The
single-stranded RNA product is then separated from the double-stranded RNA substrate by
electrophoresis through a 10% SDS-polyacrylamide gel, and quantitated by autoradiography. The
amount of single-stranded RNA recovered is proportional to the amount of NAAP in the preparation.
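
Because the single-stranded product is described as proportional to the amount of NAAP, a titration can be summarized by the slope of a line through the origin. The sketch below uses an ordinary least-squares fit constrained to pass through zero; this fitting choice, the function name, and the example values are assumptions and are not taken from the cited method.

```python
def slope_through_origin(naap_amounts, unwound_signal):
    """Least-squares slope of signal versus NAAP amount, forced through zero.

    naap_amounts:   amounts of NAAP added to each reaction (any unit)
    unwound_signal: single-stranded RNA signal from each reaction
    Returns signal per unit of NAAP, a measure of specific activity.
    """
    if len(naap_amounts) != len(unwound_signal):
        raise ValueError("inputs must have the same length")
    sum_xx = sum(x * x for x in naap_amounts)
    if sum_xx == 0:
        raise ValueError("at least one NAAP amount must be non-zero")
    sum_xy = sum(x * y for x, y in zip(naap_amounts, unwound_signal))
    return sum_xy / sum_xx


# Hypothetical titration with a perfectly proportional response.
print(slope_through_origin([1.0, 2.0, 4.0], [10.0, 20.0, 40.0]))
```

Points that fall well below the fitted line at high NAAP amounts would suggest the reaction is no longer in the proportional range.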

Splicing activity of NAAP can be measured by the method of Hartmann, A.M. et al. (supra).
Varying amounts of a construct containing NAAP, for example cloned into an expression vector such
as pEGFP-C2 (Clontech), are transfected into HEK293 cells using the calcium phosphate method as
described. RNA is isolated 17-24 hours after the transfection using the RNEASY mini kit
(QIAGEN). Isolated RNA is mixed with antisense primer and dNTP and subjected to reaction with
reverse transcriptase. Products of the reverse transcriptase reaction are amplified by PCR and
analyzed on a 2% agarose Tris borate-EDTA gel.

In the alternative, NAAP function is assessed by expressing the sequences encoding NAAP at
physiologically elevated levels in mammalian cell culture systems. cDNA is subcloned into a
mammalian expression vector containing a strong promoter that drives high levels of cDNA
expression. Vectors of choice include pCMV SPORT (Life Technologies) and pCR3.1 (Invitrogen
Corporation, Carlsbad CA), both of which contain the cytomegalovirus promoter. 5-10 µg of
recombinant vector are transiently transfected into a human cell line, preferably of endothelial or
hematopoietic origin, using either liposome formulations or electroporation. 1-2 µg of an additional
plasmid containing sequences encoding a marker protein are co-transfected.

Expression of a marker protein provides a means to distinguish transfected cells from
nontransfected cells and is a reliable predictor of cDNA expression from the recombinant vector.
Marker proteins of choice include, e.g., Green Fluorescent Protein (GFP; CLONTECH), CD64, or a
CD64-GFP fusion protein. Flow cytometry (FCM), an automated laser optics-based technique, is
used to identify transfected cells expressing GFP or CD64-GFP and to evaluate the apoptotic state of
the cells and other cellular properties.

FCM detects and quantifies the uptake of fluorescent molecules that diagnose events
preceding or coincident with cell death. These events include changes in nuclear DNA content as
measured by staining of DNA with propidium iodide; changes in cell size and granularity as measured
by forward light scatter and 90 degree side light scatter; down-regulation of DNA synthesis as
measured by decrease in bromodeoxyuridine uptake; alterations in expression of cell surface and
intracellular proteins as measured by reactivity with specific antibodies; and alterations in plasma
membrane composition as measured by the binding of fluorescein-conjugated Annexin V protein to
the cell surface. Methods in flow cytometry are discussed in Ormerod, M. G. (1994) Flow
Cytometry, Oxford, New York NY.

The influence of NAAP on gene expression can be assessed using highly purified populations
of cells transfected with sequences encoding NAAP and either CD64 or CD64-GFP. CD64 and
CD64-GFP are expressed on the surface of transfected cells and bind to conserved regions of human
immunoglobulin G (IgG). Transfected cells are efficiently separated from nontransfected cells using
magnetic beads coated with either human IgG or antibody against CD64 (DYNAL, Inc., Lake Success
NY). mRNA can be purified from the cells using methods well known by those of skill in the art.
Expression of mRNA encoding NAAP and other genes of interest can be analyzed by northern
analysis or microarray techniques.

Pseudouridine synthase activity of NAAP is assayed using a tritium (³H) release assay
modified from Nurse et al. ((1995) RNA 1:102-112), which measures the release of ³H from the C5
position of the pyrimidine component of uridylate (U) when ³H-radiolabeled U in RNA is isomerized
to pseudouridine (Ψ). A typical 500 µl assay mixture contains 50 mM HEPES buffer (pH 7.5), 100
mM ammonium acetate, 5 mM dithiothreitol, 1 mM EDTA, 30 units RNase inhibitor, and 0.1-4.2 µM
[5-³H]tRNA (approximately 1 µCi/nmol tRNA). The reaction is initiated by the addition of <5 µl of a
concentrated solution of NAAP (or sample containing NAAP) and incubated for 5 min at 37 °C.
Portions of the reaction mixture are removed at various times (up to 30 min) following the addition of
NAAP and quenched by dilution into 1 ml 0.1 M HCl containing Norit-SA3 (12% w/v). The
quenched reaction mixtures are centrifuged for 5 min at maximum speed in a microcentrifuge, and the
supernatants are filtered through a plug of glass wool. The pellet is washed twice by resuspension in
1 ml 0.1 M HCl, followed by centrifugation. The supernatants from the washes are separately passed
through the glass wool plug and combined with the original filtrate. A portion of the combined
filtrate is mixed with scintillation fluid (up to 10 ml) and counted using a scintillation counter. The
amount of ³H released from the RNA and present in the soluble filtrate is proportional to the amount
of pseudouridine synthase activity in the sample (Ramamurthy, V. (1999) J. Biol. Chem.
274:22225-22230).

In the alternative, pseudouridine synthase activity of NAAP is assayed at 30 to 37 °C in a
mixture containing 100 mM Tris-HCl (pH 8.0), 100 mM ammonium acetate, 5 mM MgCl2, 2 mM
dithiothreitol, 0.1 mM EDTA, and 1-2 fmol of [³²P]-radiolabeled runoff transcripts (generated in vitro
by an appropriate RNA polymerase, i.e., T7 or SP6) as substrates. NAAP is added to initiate the
reaction or omitted from the reaction in control samples. Following incubation, the RNA is extracted
with phenol-chloroform, precipitated in ethanol, and hydrolyzed completely to 3'-nucleotide
monophosphates using RNase T2. The hydrolysates are analyzed by two-dimensional thin layer
chromatography, and the amounts of ³²P radiolabel present in the ΨMP and UMP spots are evaluated
after exposing the thin layer chromatography plates to film or a PhosphorImager screen. Taking into
account the relative number of uridylate residues in the substrate RNA, the relative amounts of ΨMP and
UMP are determined and used to calculate the relative amount of Ψ per tRNA molecule (expressed in
mol Ψ/mol of tRNA or mol Ψ/mol of tRNA/minute), which corresponds to the amount of
pseudouridine synthase activity in the NAAP sample (Lecointe, supra).
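
The calculation of mol Ψ per mol of tRNA from the two spot intensities can be sketched as below. The sketch assumes that every uridylate residue counted contributes equally to the ³²P signal, which depends on how the transcript was labeled; the function names and example values are hypothetical and do not come from the cited reference.

```python
def psi_per_rna(psi_mp_signal, ump_signal, uridylates_per_rna):
    """Estimate mol pseudouridine per mol RNA from TLC spot intensities.

    psi_mp_signal:      32P signal in the pseudouridine monophosphate spot
    ump_signal:         32P signal in the UMP spot
    uridylates_per_rna: number of labeled uridylate residues per transcript
    """
    total = psi_mp_signal + ump_signal
    if total <= 0:
        raise ValueError("total signal must be positive")
    fraction_converted = psi_mp_signal / total
    return fraction_converted * uridylates_per_rna


def psi_formation_rate(psi_mp_signal, ump_signal, uridylates_per_rna, minutes):
    """Same quantity expressed per minute of incubation."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return psi_per_rna(psi_mp_signal, ump_signal, uridylates_per_rna) / minutes


# Hypothetical values: 5% of signal in the pseudouridine spot, 20 uridylates.
print(psi_per_rna(50.0, 950.0, 20))
print(psi_formation_rate(50.0, 950.0, 20, minutes=30.0))
```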

N²,N²-dimethylguanosine transferase ((m²₂G)methyltransferase) activity of NAAP is
measured in a 160 µl reaction mixture containing 100 mM Tris-HCl (pH 7.5), 0.1 mM EDTA, 10 mM
MgCl2, 20 mM NH4Cl, 1 mM dithiothreitol, 6.2 µM S-adenosyl-L-[methyl-³H]methionine (30-70
Ci/mM), 8 µg m²₂G-deficient tRNA or wild type tRNA from yeast, and approximately 100 µg of
purified NAAP or a sample comprising NAAP. The reactions are incubated at 30 °C for 90 min and
chilled on ice. A portion of each reaction is diluted to 1 ml in water containing 100 µg BSA. 1 ml of
2 M HCl is added to each sample and the acid insoluble products are allowed to precipitate on ice for
20 min before being collected by filtration through glass fiber filters. The collected material is
washed several times with HCl and quantitated using a liquid scintillation counter. The amount of ³H
incorporated into the m²₂G-deficient, acid-insoluble tRNAs is proportional to the amount of
N²,N²-dimethylguanosine transferase activity in the NAAP sample. Reactions comprising no
substrate tRNAs, or wild-type tRNAs that have already been modified, serve as control reactions
which should not yield acid-insoluble ³H-labeled products.

Polyadenylation activity of NAAP is measured using an in vitro polyadenylation reaction.
The reaction mixture is assembled on ice and comprises 10 µl of 5 mM dithiothreitol, 0.025% (v/v)
NONIDET P-40, 50 mM creatine phosphate, 6.5% (w/v) polyvinyl alcohol, 0.5 unit/µl RNAGUARD
(Pharmacia), 0.025 µg/µl creatine kinase, 1.25 mM cordycepin 5'-triphosphate, and 3.75 mM MgCl2,
in a total volume of 25 µl. 60 fmol of CstF, 50 fmol of CPSF, 240 fmol of PAP, 4 µl of crude or
partially purified CF II and various amounts of CF I are then added to the reaction mix. The
volume is adjusted to 23.5 µl with a buffer containing 50 mM Tris-HCl, pH 7.9, 10% (v/v) glycerol,
and 0.1 mM Na-EDTA. The final ammonium sulfate concentration should be below 20 mM. The
reaction is initiated (on ice) by the addition of 15 fmol of ³²P-labeled pre-mRNA template, along with
2.5 µg of unlabeled tRNA, in 1.5 µl of water. Reactions are then incubated at 30 °C for 75-90 min
and stopped by the addition of 75 µl (approximately two-volumes) of proteinase K mix (0.2 M
Tris-HCl, pH 7.9, 300 mM NaCl, 25 mM Na-EDTA, 2% (w/v) SDS), 1 µl of 10 mg/ml proteinase K,
0.25 µl of 20 mg/ml glycogen, and 23.75 µl of water. Following incubation, the RNA is precipitated with
ethanol and analyzed on a 6% (w/v) polyacrylamide, 8.3 M urea sequencing gel. The dried gel is
developed by autoradiography or using a phosphorimager. Cleavage activity is determined by
comparing the amount of cleavage product to the amount of pre-mRNA template. The omission of
any of the polypeptide components of the reaction and substitution of NAAP is useful for identifying
the specific biological function of NAAP in pre-mRNA polyadenylation (Rüegsegger, supra; and
references within).

tRNA synthetase activity is measured as the aminoacylation of a substrate tRNA in the
presence of [¹⁴C]-labeled amino acid. NAAP is incubated with [¹⁴C]-labeled amino acid and the
appropriate cognate tRNA (for example, [¹⁴C]alanine and tRNAAla) in a buffered solution.
¹⁴C-labeled product is separated from free [¹⁴C]amino acid by chromatography, and the incorporated
¹⁴C is quantified by scintillation counter. The amount of ¹⁴C-labeled product detected is proportional to
the activity of NAAP in this assay.

In the alternative, NAAP activity is measured by incubating a sample containing NAAP in a
solution containing 1 mM ATP, 5 mM Hepes-KOH (pH 7.0), 2.5 mM KCl, 1.5 mM magnesium
chloride, and 0.5 mM DTT along with misacylated [¹⁴C]-Glu-tRNAGln (e.g., 1 µM) and a similar
concentration of unlabeled L-glutamine. Following the quenching of the reaction with 3 M sodium
acetate (pH 5.0), the mixture is extracted with an equal volume of water-saturated phenol, and the
aqueous and organic phases are separated by centrifugation at 15,000 x g at room temperature for 1
min. The aqueous phase is removed and precipitated with 3 volumes of ethanol at -70°C for 15 min.
The precipitated aminoacyl-tRNAs are recovered by centrifugation at 15,000 x g at 4°C for 15 min.
The pellet is resuspended in 25 mM KOH, deacylated at 65°C for 10 min, neutralized with 0.1 M
HCl (to final pH 6-7), and dried under vacuum. The dried pellet is resuspended in water and spotted
onto a cellulose TLC plate. The plate is developed in either isopropanol/formic acid/water or
ammonia/water/chloroform/methanol. The image is subjected to densitometric analysis and the
relative amounts of Glu and Gln are calculated based on the Rf values and relative intensities of the
spots. NAAP activity is calculated based on the amount of Gln resulting from the transformation of
Glu while acylated as Glu-tRNAGln (adapted from Curnow, A.W. et al. (1997) Proc. Natl. Acad. Sci.
USA 94:11819-26).

An alternative experiment for NAAP activity involves binding of DNA-bound KAP-1-RBBC
protein, a corepressor for KRAB domain proteins, directly to the KRAB domain. Following
preparation of plasmids and protein purification (Peng, H., et al. (2000) J. Biol. Chem.
275:18000-18010), an electrophoretic mobility shift assay (EMSA) can be performed in which purified
recombinant GAL4-KRAB protein is incubated with purified Escherichia coli- or
baculovirus-expressed KAP-1-RBBC protein for 15 min at 30°C. The KRAB protein is then added to the reaction
simultaneously with the GAL4-KRAB and KAP-1-RBBC proteins, or the KRAB protein can be
pre-incubated with the KAP-1-RBBC protein for 15 min at 30°C. One ml of ³²P-labeled GAL4 probe (10^
cpm/ml) is then added, and the reaction incubated for an additional 15 min at 30°C. The
DNA-protein complexes are then resolved on native polyacrylamide gels by electrophoresis in 45 mM Tris
borate, pH 8.3, 1 mM EDTA buffer at 4°C. The EMSA gels are dried and visualized by
autoradiography. Binding of the GAL4-KRAB protein complex to a standard ³²P-labeled GAL4
oligonucleotide recognition sequence is demonstrated by a mobility shift and is indicative of KRAB
domain binding via direct interaction between the KRAB domain and KAP-1 protein.

NAAP activity can be demonstrated by the use of in vitro translation assays which utilize
mutant strains of S. cerevisiae lacking the FUN12 gene which encodes yeast translation initiation
factor 2 (IF2). These strains exhibit a slow growth phenotype which can be rescued (made to grow at
a normal rate) by the addition of IF2, including heterologous IF2 which is produced by recombinant
methods. Briefly, the fun12Δ strain J133 is transformed with either the low copy-number FUN12
plasmid pC479, an expression plasmid carrying NAAP, or the vector only. The control strains and
the test strains are streaked on synthetic minimal medium containing 10% galactose plus the required
nutrient supplements, and the plates are incubated at 30 °C for 5 days. In vitro translation extracts are
prepared from the fun12Δ strain J133. Extracts are incubated with 200 ng of luciferase mRNA and
increasing amounts of the control strains or the test strains containing a source of IF2. Luminescence
of the samples is plotted as a function of the amount of test protein added to the translation reaction.
The amount of luminescence corresponds to the amount of NAAP activity in the sample (Lee et al.
supra).
XIX. Identification of NAAP Agonists and Antagonists

Agonists or antagonists of NAAP activation or inhibition may be tested using the assays
described in section XVIII. Agonists cause an increase in NAAP activity and antagonists cause a
decrease in NAAP activity.
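
The distinction between agonists and antagonists can be illustrated with a small comparison of assay readings taken with and without a test compound. This is a sketch only: the 20% threshold, the function name, and the example readings are assumptions chosen for illustration and are not specified in the text.

```python
def classify_compound(activity_with, activity_without, threshold=0.2):
    """Label a compound from NAAP activity measured with and without it.

    activity_with:    NAAP activity in the presence of the compound
    activity_without: NAAP activity in its absence (must be positive)
    threshold:        minimum fractional change treated as an effect;
                      0.2 is a hypothetical cutoff, not from the source
    """
    if activity_without <= 0:
        raise ValueError("baseline activity must be positive")
    change = (activity_with - activity_without) / activity_without
    if change >= threshold:
        return "agonist"
    if change <= -threshold:
        return "antagonist"
    return "no effect"


# Hypothetical readings from any of the assays in section XVIII.
print(classify_compound(150.0, 100.0))
print(classify_compound(60.0, 100.0))
print(classify_compound(105.0, 100.0))
```
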
Various modifications and variations of the described compositions, methods, and systems of
the invention will be apparent to those skilled in the art without departing from the scope and spirit of
the invention. It will be appreciated that the invention provides novel and useful proteins, and their
encoding polynucleotides, which can be used in the drug discovery process, as well as methods for
using these compositions for the detection, diagnosis, and treatment of diseases and conditions.
Although the invention has been described in connection with certain embodiments, it should be
understood that the invention as claimed should not be unduly limited to such specific embodiments.
Nor should the description of such embodiments be considered exhaustive or limit the invention to
the precise forms disclosed. Furthermore, elements from one embodiment can be readily recombined
with elements from one or more other embodiments. Such combinations can form a number of
embodiments within the scope of the invention. It is intended that the scope of the invention be
defined by the following claims and their equivalents.

What is claimed is:

1. An isolated polypeptide selected from the group consisting of: 

a) a polypeptide comprising an amino acid sequence selected from the group consisting 
of SEQ ID NO: 1-35, 

b) a polypeptide comprising a naturally occurring amino acid sequence at least 90% 
identical to an amino acid sequence selected from the group consisting of SEQ ID 
NO: 1-2, SEQ ID NO:4-7, SEQ ID NO:9-16, SEQ ID NO: 18-19, SEQ ID NO:21-22, 
SEQ ID NO:24, SEQ ID NO:27-35, 

c) a polypeptide comprising a naturally occurring amino acid sequence at least 97% 
identical to an amino acid sequence selected from the group consisting of SEQ ID
NO:3, SEQ ID NO:17, and SEQ ID NO:25,

d) a polypeptide comprising a naturally occurring amino acid sequence at least 98% 
identical to the amino acid sequence of SEQ ID NO: 8, 

e) a polypeptide comprising a naturally occurring amino acid sequence at least 95% 
identical to the amino acid sequence of SEQ ID NO:20, 

f) a polypeptide comprising a naturally occurring amino acid sequence at least 94% 
identical to the amino acid sequence of SEQ ID NO:23, 

g) a polypeptide comprising a naturally occurring amino acid sequence at least 91%
identical to the amino acid sequence of SEQ ID NO:26,

h) a biologically active fragment of a polypeptide having an amino acid sequence 
selected from the group consisting of SEQ ID NO: 1-35, and 

i) an immunogenic fragment of a polypeptide having an amino acid sequence selected 
from the group consisting of SEQ ID NO: 1-35. 

2. An isolated polypeptide of claim 1 comprising an amino acid sequence selected from the 
group consisting of SEQ ID NO: 1-35. 

3. An isolated polynucleotide encoding a polypeptide of claim 1. 

4. An isolated polynucleotide encoding a polypeptide of claim 2. 

5. An isolated polynucleotide of claim 4 comprising a polynucleotide sequence selected from 
the group consisting of SEQ ID NO: 36-70. 

6. A recombinant polynucleotide comprising a promoter sequence operably linked to a
polynucleotide of claim 3. 

7. A cell transformed with a recombinant polynucleotide of claim 6. 

8. A transgenic organism comprising a recombinant polynucleotide of claim 6. 

9. A method of producing a polypeptide of claim 1, the method comprising:

a) culturing a cell under conditions suitable for expression of the polypeptide, wherein 
said cell is transformed with a recombinant polynucleotide, and said recombinant 
polynucleotide comprises a promoter sequence operably linked to a polynucleotide 
encoding the polypeptide of claim 1, and 

b) recovering the polypeptide so expressed. 

10. A method of claim 9, wherein the polypeptide comprises an amino acid sequence 
selected from the group consisting of SEQ ID NO: 1-35. 

11. An isolated antibody which specifically binds to a polypeptide of claim 1. 

12. An isolated polynucleotide selected from the group consisting of: 

a) a polynucleotide comprising a polynucleotide sequence selected from the group 
consisting of SEQ ID NO:36-70, 

b) a polynucleotide comprising a naturally occurring polynucleotide sequence at least 
90% identical to a polynucleotide sequence selected from the group consisting of 
SEQ ID NO:36-70,

c) a polynucleotide complementary to a polynucleotide of a), 

d) a polynucleotide complementary to a polynucleotide of b), and 

e) an RNA equivalent of a)-d). 

13. An isolated polynucleotide comprising at least 60 contiguous nucleotides of a 
polynucleotide of claim 12. 

14. A method of detecting a target polynucleotide in a sample, said target polynucleotide 
having a sequence of a polynucleotide of claim 12, the method comprising: 

a) hybridizing the sample with a probe comprising at least 20 contiguous nucleotides 
comprising a sequence complementary to said target polynucleotide in the sample,
and which probe specifically hybridizes to said target polynucleotide, under
conditions whereby a hybridization complex is formed between said probe and said
target polynucleotide or fragments thereof, and

b) detecting the presence or absence of said hybridization complex, and, optionally, if
present, the amount thereof.

15. A method of claim 14, wherein the probe comprises at least 60 contiguous nucleotides. 

16. A method of detecting a target polynucleotide in a sample, said target polynucleotide 
having a sequence of a polynucleotide of claim 12, the method comprising:

a) amplifying said target polynucleotide or fragment thereof using polymerase chain
reaction amplification, and

b) detecting the presence or absence of said amplified target polynucleotide or fragment
thereof, and, optionally, if present, the amount thereof.

17. A composition comprising a polypeptide of claim 1 and a pharmaceutically acceptable 
excipient. 

18. A composition of claim 17, wherein the polypeptide comprises an amino acid sequence
selected from the group consisting of SEQ ID NO: 1-35. 

19. A method for treating a disease or condition associated with decreased expression of 
functional NAAP, comprising administering to a patient in need of such treatment the composition of 
claim 17. 

20. A method of screening a compound for effectiveness as an agonist of a polypeptide of 
claim 1, the method comprising: 

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and 

b) detecting agonist activity in the sample. 

21. A composition comprising an agonist compound identified by a method of claim 20 and a 
pharmaceutically acceptable excipient. 

22. A method for treating a disease or condition associated with decreased expression of 
functional NAAP, comprising administering to a patient in need of such treatment a composition of
claim 21. 

23. A method of screening a compound for effectiveness as an antagonist of a polypeptide of 
claim 1, the method comprising: 

a) exposing a sample comprising a polypeptide of claim 1 to a compound, and

b) detecting antagonist activity in the sample. 

24. A composition comprising an antagonist compound identified by a method of claim 23 
and a pharmaceutically acceptable excipient. 

25. A method for treating a disease or condition associated with overexpression of functional 
NAAP, comprising administering to a patient in need of such treatment a composition of claim 24.

26. A method of screening for a compound that specifically binds to the polypeptide of claim 
1, the method comprising: 

a) combining the polypeptide of claim 1 with at least one test compound under suitable 
conditions, and 

b) detecting binding of the polypeptide of claim 1 to the test compound, thereby
identifying a compound that specifically binds to the polypeptide of claim 1. 

27. A method of screening for a compound that modulates the activity of the polypeptide of 
claim 1, the method comprising: 

a) combining the polypeptide of claim 1 with at least one test compound under 
conditions permissive for the activity of the polypeptide of claim 1, 

b) assessing the activity of the polypeptide of claim 1 in the presence of the test 
compound, and 

c) comparing the activity of the polypeptide of claim 1 in the presence of the test 
compound with the activity of the polypeptide of claim 1 in the absence of the test 
compound, wherein a change in the activity of the polypeptide of claim 1 in the 
presence of the test compound is indicative of a compound that modulates the activity 
of the polypeptide of claim 1. 

28. A method of screening a compound for effectiveness in altering expression of a target 
polynucleotide, wherein said target polynucleotide comprises a sequence of claim 5, the method 
comprising:

a) exposing a sample comprising the target polynucleotide to a compound, under 
conditions suitable for the expression of the target polynucleotide, 

b) detecting altered expression of the target polynucleotide, and 

c) comparing the expression of the target polynucleotide in the presence of varying
amounts of the compound and in the absence of the compound.

29. A method of assessing toxicity of a test compound, the method comprising: 

a) treating a biological sample containing nucleic acids with the test compound,

b) hybridizing the nucleic acids of the treated biological sample with a probe comprising
at least 20 contiguous nucleotides of a polynucleotide of claim 12 under conditions
whereby a specific hybridization complex is formed between said probe and a target
polynucleotide in the biological sample, said target polynucleotide comprising a
polynucleotide sequence of a polynucleotide of claim 12 or fragment thereof,

c) quantifying the amount of hybridization complex, and

d) comparing the amount of hybridization complex in the treated biological sample with 
the amount of hybridization complex in an untreated biological sample, wherein a 
difference in the amount of hybridization complex in the treated biological sample is 
indicative of toxicity of the test compound. 

30. A diagnostic test for a condition or disease associated with the expression of NAAP in a 
biological sample, the method comprising: 

a) combining the biological sample with an antibody of claim 11, under conditions
suitable for the antibody to bind the polypeptide and form an antibody:polypeptide
complex, and

b) detecting the complex, wherein the presence of the complex correlates with the 
presence of the polypeptide in the biological sample. 

31. The antibody of claim 11, wherein the antibody is: 

a) a chimeric antibody,

b) a single chain antibody, 

c) a Fab fragment, 

d) a F(ab')2 fragment, or 

e) a humanized antibody. 

32. A composition comprising an antibody of claim 11 and an acceptable excipient.

33. A method of diagnosing a condition or disease associated with the expression of NAAP 
in a subject, comprising administering to said subject an effective amount of the composition of claim 
32. 

34. A composition of claim 32, wherein the antibody is labeled. 

35. A method of diagnosing a condition or disease associated with the expression of NAAP 
in a subject, comprising administering to said subject an effective amount of the composition of claim 
34. 

36. A method of preparing a polyclonal antibody with the specificity of the antibody of claim 
11, the method comprising: 

a) immunizing an animal with a polypeptide consisting of an amino acid sequence 
selected from the group consisting of SEQ ID NO: 1-35, or an immunogenic fragment 
thereof, under conditions to elicit an antibody response, 

b) isolating antibodies from the animal, and 

c) screening the isolated antibodies with the polypeptide, thereby identifying a 
polyclonal antibody which specifically binds to a polypeptide comprising an amino 
acid sequence selected from the group consisting of SEQ ID NO: 1-35. 

37. A polyclonal antibody produced by a method of claim 36. 

38. A composition comprising the polyclonal antibody of claim 37 and a suitable carrier. 

39. A method of making a monoclonal antibody with the specificity of the antibody of claim 
11, the method comprising: 

a) immunizing an animal with a polypeptide consisting of an amino acid sequence 
selected from the group consisting of SEQ ID NO: 1-35, or an immunogenic fragment 
thereof, under conditions to elicit an antibody response, 

b) isolating antibody producing cells from the animal, 

c) fusing the antibody producing cells with immortalized cells to form monoclonal
antibody-producing hybridoma cells, 

d) culturing the hybridoma cells, and

e) isolating from the culture monoclonal antibody which specifically binds to a
polypeptide comprising an amino acid sequence selected from the group consisting of
SEQ ID NO: 1-35.

40. A monoclonal antibody produced by a method of claim 39.

41. A composition comprising the monoclonal antibody of claim 40 and a suitable carrier. 

42. The antibody of claim 11, wherein the antibody is produced by screening a Fab
expression library.

43. The antibody of claim 11, wherein the antibody is produced by screening a recombinant 
immunoglobulin library. 

44. A method of detecting a polypeptide comprising an amino acid sequence selected from
the group consisting of SEQ ID NO: 1-35 in a sample, the method comprising:

a) incubating the antibody of claim 11 with a sample under conditions to allow specific
binding of the antibody and the polypeptide, and

b) detecting specific binding, wherein specific binding indicates the presence of a
polypeptide comprising an amino acid sequence selected from the group consisting of
SEQ ID NO: 1-35 in the sample.

45. A method of purifying a polypeptide comprising an amino acid sequence selected from 
the group consisting of SEQ ID NO: 1-35 from a sample, the method comprising: 

a) incubating the antibody of claim 11 with a sample under conditions to allow specific
binding of the antibody and the polypeptide, and

b) separating the antibody from the sample and obtaining the purified polypeptide
comprising an amino acid sequence selected from the group consisting of SEQ ID
NO: 1-35.

46. A microarray wherein at least one element of the microarray is a polynucleotide of claim 13.

47. A method of generating an expression profile of a sample which contains
polynucleotides, the method comprising:

a) labeling the polynucleotides of the sample,

b) contacting the elements of the microarray of claim 46 with the labeled 
polynucleotides of the sample under conditions suitable for the formation of a 
hybridization complex, and 

c) quantifying the expression of the polynucleotides in the sample.



48. An array comprising different nucleotide molecules affixed in distinct physical locations
on a solid substrate, wherein at least one of said nucleotide molecules comprises a first
oligonucleotide or polynucleotide sequence specifically hybridizable with at least 30 contiguous
nucleotides of a target polynucleotide, and wherein said target polynucleotide is a polynucleotide of
claim 12.

49. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to at least 30 contiguous nucleotides of said target polynucleotide.

50. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is 
completely complementary to at least 60 contiguous nucleotides of said target polynucleotide. 



51. An array of claim 48, wherein said first oligonucleotide or polynucleotide sequence is
completely complementary to said target polynucleotide.

52. An array of claim 48, which is a microarray. 

53. An array of claim 48, further comprising said target polynucleotide hybridized to a
nucleotide molecule comprising said first oligonucleotide or polynucleotide sequence.



54. An array of claim 48, wherein a linker joins at least one of said nucleotide molecules to 
said solid substrate. 



55. An array of claim 48, wherein each distinct physical location on the substrate contains
multiple nucleotide molecules, and the multiple nucleotide molecules at any single distinct physical
location have the same sequence, and each distinct physical location on the substrate contains
nucleotide molecules having a sequence which differs from the sequence of nucleotide molecules at
another distinct physical location on the substrate.

56. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:1.

57. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:2.

58. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:3.

59. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:4.

60. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:5.

61. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:6.

62. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:7.

63. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:8.

64. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:9.

65. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:10.

66. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:11.

67. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:12.

68. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:13.

69. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:14.

70. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:15.

71. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:16.

72. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:17.

73. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:18.

74. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:19.

75. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:20.

76. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:21.

77. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:22.

78. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:23.

79. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:24.

80. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:25.

81. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:26.

82. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:27.

83. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:28.

84. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:29.

85. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:30.

86. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:31.

87. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:32.

88. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:33.

89. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:34.

90. A polypeptide of claim 1, comprising the amino acid sequence of SEQ ID NO:35.

91. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:36.

92. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:37.

93. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:38.

94. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:39.

95. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:40.

96. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:41.

97. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:42.

98. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:43.

99. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:44.

100. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:45.

101. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:46.

102. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:47.

103. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:48.

104. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:49.

105. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:50.

106. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:51.

107. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:52.

108. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:53.

109. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:54.

110. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:55.

111. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:56.

112. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:57.

113. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:58.

114. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:59.

115. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:60.

116. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:61.

117. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:62.

118. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:63.

119. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:64.

120. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:65.

121. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:66.

122. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:67.

123. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:68.

124. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:69.

125. A polynucleotide of claim 12, comprising the polynucleotide sequence of SEQ ID NO:70.

Gly






235 










240 


Ala Arg Gly Cys 


Thr 


Ala 


Thr 


Leu 






250 










255 


Leu 


Asp 


Ala 


Val 


Ser 


Asn 


He 


Tyr 






265 










270 


Phe 


Trp 


Lys 


Glu 


Thr 


Val 


Phe 


Thr 






280 










285 


Thr 


Asp 


His 


Leu 


Ala 


Lys 


Thr 


His 






295 










300 


Thr 


Gin 


Ala 


Pro 


Ala 


Val 


Thr 


Thr 






310 










315 



<210> 2 
<211> 192 
<212> PRT 

<213> Homo sapiens 
<220> 
<221> misc_feature
<223> Incyte ID No: 

<400> 2 



Met 


Lys 


Ala 


Val 


Leu 


1 








5 


Val 


Asp 


He 


Thr 


Leu 










9 n 

^ yj 


Arg 


Gly 


Thr 


Leu 


on n 










3 5 


Gly Leu 


Leu Gly 


Lys 












Met 


Leu 


Arg Arg 


Trp 










o o 


His 


Ala 


Gin 


Lys 












ft n 

o u 


Lys 


Met 


Arg 


Ser 


Leu 












Gin 


Glu 


Asn 


Ser 


O C J- 










inn 


Lys 


Tyr 


He 


Cys 


Arg 










125 


He 


Ser 


Gin 


Ala 


Gin 










140 


He 


Glu 


Leu 


Val 


Ser 










155 


Thr 


Ala 


Lys 


Asn 


Gin 










170 


Val 


Ser 


Glu 


Lys 


Gly 










185 



<210> 3 
<211> 735 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature
<223> Incyte ID No: 

<400> 3 



Met 


Ala 


Ala 


Asp 


Ser 


1 








5 


Leu 


Asp 


Asp 


He 


Leu 
20 


Tyr 


Asn 


Pro 


Glu 


Ser 
35 


Lys 


Arg 


Lys 


Ser 


Asp 
50 


Lys 


Pro 


Ser 


Val 


His 
65 


Ser 


Ser 


Val 


Ser 


Asn 
80 


Ser 


Ala 


Thr 


Glu 


Tyr 

95 


Asn 


Lys 


Arg 


Leu 


Asp 
110 


Ala 


Ser 


Arg 


Glu 


Pro 
125 


Arg 


Lys 


Arg 


Asp 


Pro 
140 


Gly 


Ser 


Glu 


Arg 


He 
155 



7990930CD1 



Ser 


Asn 


Gin 


Thr 


Val 
10 


Lys 


Gly 


His 


Thr 


Val 
25 


Arg 


Asp 


Phe 


Ser 


His 








40 


Lys 


Lys 


Gin 


Arg 


Leu 


Glu 


Leu 


Ala 


Ala 


Met 

/ \j 


He 


Lys 


Gly Val 


He 










ft R 


Cys 


Ser 


His 


Phe 


Pro 

inn 
X u u 


He 


Val 


Glu 


Val 


Arg 


Val 


Arg 


Val 


Ser 


Pro 
130 


Lys 


Asp 


Glu 


Phe 


He 
145 


Asn 


Ser 


Ala 


Ala 


Leu 
160 


Asp 


He 


Arg 


Thr 


Phe 
175 


Thr 


Val 


Gin 


Gin 


Ala 
190 



7037554CD1 



Arg 


Glu 


Glu 


Lys 


Asp 
10 


Thr 


Glu 


Val 


Pro 


Glu 
25 


Glu 


Gin 


Asp 


Lys 


Asn 
40 


Arg 


Met 


Glu 


Ser 


Thr 
55 


Ser 


Arg 


Gin 


Leu 


Val 
70 


Asn 


Lys 


Arg 


He 


Val 
85 


Lys 


Asn 


Glu 


Glu 


Tyr 
100 


Ala 


Asp 


Arg 


Lys 


He 

115 


Tyr 


Lys 


Asn 


Gin 


Pro 
130 


Glu 


Arg 


Arg 


Ala 


Lys 
145 


Gly 


Leu 


Glu 


Val 


Asp 
160 






His 


He 


Pro 


Glu 


Asn 










15 


Leu 


Val 


Lys 


Glv 


Pro 










30 




Ser 


He 




Leu 










45 


Gin 


He Asp 


Lys 


Cys 










fin 


Arg 


Ala 


He 


Cys 


Ser 










/ Zj 


Leu 


Gly 


Phe 
















He 


Asn 


Val 


He 


Met 










1 OR 

X U J 


Thr 


Phe 


Leu 


Ser 


Glu 










120 


Gly Val 


Ala 


Cys 


Ser 










135 


Leu 


Lys 


Gly 


Asn 


Asp 










150 


Tnr 


vrxn 


ij-jLn 


Thr 


Thr 










165 


Leu 


Asp 


ta-JLy 


He 


Tyr 










180 


Asp 


vrrXU 








Gly 


Glu 


Leu 


Asn 


Val 












Gin 


Asp 


Asp 


Glu 


Leu 










30 


Glu 


Lys 


Lys 




OCX. 












Asp 


Thr 


Lys 




Gin 










fin 


Ser 


Lys 


Pro 














/ o 


Ser 


Thr 


Lys 


vjr JL^ 


Lys 










Q n 


Gin 


Arg 


Ser 


Glu 


Arg 










105 


Arg 


Leu 


Ser 


Ser 


Ser 










120 


Glu 


Lys 


Thr 


Cys 


Val 










135 


Ser 


Pro 


Thr 


Pro 


Asp 










150 


Arg Arg Ala 


Ser 


Arg 










165
OCX.


Gin 


Ssr 


ocx 


Lys 


VjX U. 


oX U. 


V CLX 


7\ on 


OtSX 


Pin 


VjrX LL 


±yr 


Pl -^7- 










1 70 

X / VJ 










1 "7 R 
X / 3 










X 0 u 


O c: J_ 




XJ.X o 


JL Ll 


X J.XX 


vjxy 


OcX 


OcX 




OcX 


Gtsiv* 
OfciX 


Asp 


r^i n 

VjrX LL 


nl n 


nl -VT- 




















X J? u 










1 Q R 
X ^ D 




Asn 




Glu 


A cm 


VjX LL 


m n 

VjX LL 


m 11 

VjX LL 




VCLX 


vjtX LL 


rnl 11 

VjX LL 


Ato^ 


"^73 1 

V CLX 


nl 11 

VjX LL 




















9 flR 

^ VJ J 










91 n 

Z X u 


Glu 


Asp 


Glu 


Glu 


Val 


m 11 

vj-L LL 


Glu 


A CSTi 
rt.toj— ' 


Ala 


Glu 


Glu 


Asp 


Glu 


Glu 


Val 










91 R 

^ X ^ 










9 9 n 

z ^ u 










99 R 
Z Z D 




w J- U. 




vifx jf 


Glu 


VirX LX 


m 11 

\ir_L LL 


m 11 

ox LL 


V7X LL 


V^X LL 


nl 11 
wX LL 


nl 11 
V7X LL 


f^l n 

V3X LL 


al 11 

virX LL 


VjtX LL 










9 0 
^ o \j 










9 R 

^ O 3 










9 AO 
Z 4t U 


Glu 


Glu 


Glu 


Glu 


Glu 


VJX LL 


ai 11 

VjX LL 


m 11 

VjX LL 


al 11 
V7X LL 


xyx 


Cil 11 
VJT_L LL 


ni n 

V7X11 


A CST^ 


ni n 

wX LL 


AX y 










9 

^ O 










9 RO 










9 R R 
ZDD 




Gin 


XJ o 


m 11 

V2rX Li 


Glu 


vjx 


A cm 


A OY^ 




A csn» 


XXXX 




OcX 


ni 11 

VirXLL 


Al a 

AXCL 










9 n 
z o u 










9 R 
ZOO 










o *7 n 
z / u 


Cf O T~ 
OCX 




Cot" 
OCX 


vjxy 


OcX 


P"! 77 

(.^X LL 


Ocx 


VcLJ. 


OCJL 


iriic 


Txir 


Asp 




GOT^ 

OCX 


vax 










on ^ 










o Q n 

Z o U 










O Q R 
Z oD 




O tJJL 


VjXJy 


OcX 


VjxXJy 


X IIX 


rVSp 


VjX_Y 


Gov 
Ocx 


Asp 


PI n 


Lys 


T 

Lys 


Lys 


Pi n 










9 Q n 










9 Q R 










n n 
J u u 






ax y 


jHtXCl 


Arcf 


vjXy 


X Xc 


OcX 


Jtrx O 


Tl 
XXc 


\7j5i 1 


Jrilc 


Asp 


Arg 


Ser 










n R 
o u o 










o xu 










1 




OCX. 


OcX 


AT « 
raXcl 


OcX 




Gci V 
OcX 


Tyr 


Al a 


vjxy 


G^-K* 

OCX 


vjX u. 


Lys 


Lys 


XlXto 










o ^ u 










9 R 










*3 1 n 
o J U 




T 


Xjeu 


OCX 


OCX 


Ser 




Arg 


Axa 


V ax 


Arg 


Lys 


Asp 


Pi n 


Thr 










Q o c 
J -5 O 










"5 A n 

O 41 u 










/I 

-D 4tD 


Got" 


T .Tro 
Jjy CD 


Leu 


T 

ijys 


lyr 


V a.X 


T 

xjeu 


m -n 
LjXII 


Asp 


Ala 


Axg 


Jrnc 


TDVic* 

jrne 


Leu 


Tl o 

XX e 










n 

O u 










o c c: 
ODD 










"5 <^ o 
O O u 


Jjjr to 


Gear* 
OCX 


7\ on 


A an 


XXX o 


virX LL 


A cm 
Atoll 


V9.X 


Ocx 


Leu 


Al sa 


Lys 


Al » 


Lys 


Pl 










O O I? 










o / u 










O "7 C 
-J / D 


V CL JU 




OCX 


X IIX 


XJCLL 


JrX L-J 


V CLX 


jniSn 


m 11 
vjX LL 


T ii%rc! 
XJ^ to 


Lys 


XJCLL 


Asn 


Leu 


Al a 










O O w 










RR 










Q n 

J u 


Pile 


xix y 


. OCX 




i-ix y 


OCX 


V CLX 


X X C 


XlC LL 


Tl ^ 

XXC 


iriic 


Got" 
OCX 


T7p» 1 
V d X 


jHkX y 


PI n 
vnX LL 










Q R 










Ann 

41: U U 










/i n R 

4t U D 


o t; j_ 


Gly 


T 

xjy s 


jcriic 


{^1 n 

vuXli 


VjX^ 


irixc 




Arg 


Leu 


G<m^ 
OcX 


Gov* 
O cX 


vuX LL 


OcX 


XlXto 










^ X vj 










A1 R 
4t X D 










/ion 
4tZ u 


His 


vJ JL_y 


virXjf 


OCX 


irX 


X X c 


■Rn c; 
jnx to 


ixp 


V uX 


Leu 


XT X L/ 


Al ^ 
AXCL 


fil -vr 
VaXy 


X\Lc L. 


G o V 
Ocx 










A9 R 










*t J u 










/I c: 


x-lJLCt 


jjys 


IXLc L» 


XjcLL 






V CLX 


IrXic 


T 

Lys 


Tl *=» 

XXc 


Asp 


Trp 


Tl 

XXc 


Cys 


Arg 










*± r± \J 










A AR 
4t 4t D 










/I R n 




ai n 

VjX ll 


Xieu 


irx KJ 


Jriic 


XllX 


Lys 


OcxT 


Ala 


XIX to 


Leu 


X liX 


Asn 


Pro 


Trp 










A R R 
f± J J 










A ^ n 

4t O U 










4t D D 


noil 


v7X LL 


XxX o 


xiys 


JTX 


V a.x 


Lys 


Tl <=» 
XXc 


Vjxy 


Arg 


Asp 


rni \r 


PI n 


nl n 

oXU 


Tl o 

xxe 




















A7 R 










/on 

4t o U 


m 11 


XICLL 


VjX Ll 




VjX 


X IIX 


V3X1X 


UCLL 


to 


XJC LL 


Leu 


XrllC 


Jrx O 


Pro 


ASp 










ARR 










AQ n 










/I Q R 

^ly D 




OCX 


X X c 


Asp 


Leu 


Tyr 


VjXll 


V^X 


Tl ca 

xxe 


JrlX to 


Lys 


IXLc U 


Arg 


rlXto 


Lys 










R n n 










R OR 
DUO 










D J.U 




x^x y 


X*XC 


XXX O 


OCX 


V^Xll 


irx ^ 


Arg 


OCX 


A vrT 
AX y 


vjxj^ 


raX y 


Jrx O 


Gov* 
OcX 


Arg 










R1 R 

X J 










R9n 

o ^ u 










RO R 
D Z D 


x^l. y 




Jr X 


V dX 




7\ or~» 


V ctx 


oX_y 


Arg 


"X y 


Axy 


irx O 


PI 1 1 


Asp 


Tyr 










R n 

O \J 










R"^ R 










R / n 

D 41: U 




xxc 


nx o 


A cm 


OCX 


A 'Y~TT 


xj_y to 


Lys 


It X U 


AX y 


Tl o 
XXc 


A Qr% 


Tyr 


Pro 


Pro 










R AR 










RRO 
DDU 










R c: c 

ddd 


U. 


JrXxc 


Xlxo 


VjjXII 


Arg 


Pro 


Vjxy 


Tyr 


XlcU. 


Lys 


Asp 


Pro 


Arg 


Tyr 


r*i 1-1 










R fin 

J O L/ 










R fi R 

D O D 










R7 n 
D / u 




vctx 




Ser 




Tiir 


Asn 


Leu 


xxe 


Pro 


Asn 


Arg 


Arg 


fne 


Ser 










R T R 










R Q n 

D O U 










c: D c: 
D OD 


Gly 


Val 




Atq" 


Asp 


Val 


Pile 


Leu 


Asn 


Gly 


Ser 


a.yx 


A Gn 

AtolX 


Asp 


xyx 










590 










595 










600 


Val 


Arg 


Glu 


Phe 


His 


Asn 


Met 


Gly 


Pro 


Pro 


Pro 


Pro 


Trp 


Gin 


Gly 










605 










610 










615 


Met 


Pro 


Pro 


Tyr 


Pro 


Gly 


Met 


Glu 


Gin 


Pro 


Pro 


His 


His 


Pro 


Tyr 










620 










625 










630 


Tyr 


Gin 


His 


His 


Ala 


Pro 


Pro 


Pro 


Gin 


Ala 


His 


Pro 


Pro 


Tyr 


Ser
O J D


Gly His 


His 


Pro 


Val 










n 


Val 


His 


Asp 


Tyr 


Asp 

O DO 


Gin 


Ala 


Val 


Val 


Ser 

o r\ 
D Q U 


Arg 


Glu 


Arg 


Glu 


Arg 
695 


Glu 


Arg 


Asp 


Arg 


Gly 
710 


Cys 


Asp 


Arg 


Asp 


Arg 
725 



<210> 4 

<211> 1340 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature
<223> Incyte ID No: 

<400> 4 



Met 


Leu 


Thr 


Leu 


Cys 


1 








5 


Asp 


Arg 


Val 


Ala 


Phe 
20 


Ser 


Leu 


Arg 


Val 


Pro 
35 


Arg 


lie 


Leu 


Arg 


Gin 
50 


Gin 


Gin 


Leu 


Arg 


Gin 
65 


Glu 


Leu 


Arg 


Leu 


Val 
80 


Ala 


lie 


Leu 


Leu 


Gin 
95 


lie 


Leu 


Ser 


Gin 


Met 
110 


Leu 


Asn 


Phe 


His 


Tyr 
125 


Ser 


Ser 


Glu 


Gin 


Arg 
140 


Arg 


Arg 


He 


Phe 


Cys 
155 


Gly 


lie 


Asn 


Leu 


Val 
170 


Asp 


Leu 


Asn 


Pro 


Val 
185 


Arg 


lie 


Gly 


Arg 


Cys 
200 


Gly 


Asn 


Ser 


He 


Glu 

215 


Leu 


lie 


Arg 


Glu 


Val 
230 


Phe 


Leu 


Thr 


Gin 


Arg 
245 


Pro 


Met 


Asp 


Asp 


Ala 
260 


Val 


Leu 


Ser 


Gin 


Glu 
275 











640 


Pro 


His 


Glu 


Ala 


Arg 
655 


Met 


Arg Val 


Asp Asp 










670 


Gly 


Arg 


Arg 


Ser 


Arg 

685 


Asp 


Arg 


Pro 


Arg 


Asp 
700 


Arg 


Asp 


Arg 


Glu 


Arg 
715 


Asp 


Arg 


Gly 


Glu 


Arg 
730 



1515347CD1 



Arg 


Cys 


Gly 


Glu 


Ser 
10 


Val 


He 


Pro 


Pro 


Val 
25 


Arg 


Pro 


Pro 


Pro 


Leu 
40 


Gly Leu 


Arg 


Glu 


His 










55 


Thr 


Thr 


Ala 


Pro 


Arg 
70 


Gin 


Phe 


Asp 


Ser 


Gly 
85 


Lys 


Leu 


Lys 


Ser 


Glu 

100 


He 


Leu 


Met 


Leu 


Asp 
115 


Leu 


Thr 


Tyr 


Val 


Arg 
130 


Gin 


Glu 


Leu 


Met 


Arg 
145 


Ala 


He 


Leu 


Ser 


Thr 
160 


Glu .Ala 


Asp 


Thr 


Val 










175 


Met 


Asp 


Ala 


Lys 


Ala 
190 


Lys 


Asp 


He 


His 


He 
205 


Glu 


Lys 


Leu 


Leu 


Lys 

220 


Ala 


Ala 


Gin 


Gly Asn 










235 


Thr 


He 


Gin 


Glu 


Leu 
250 


Gly 


Phe 


Pro 


Val 


Lys 
265 


Pro 


Ser 


val 


Thr 


Glu 
280 






645 



Tyr 


Arg 


Asp 


Lys 


Arg 










DOU 


pne 


Leu 


Arg 


Arg 


Thr 










O / D 


Pro 


Arg 


Glu 


Arg 


Asp 










690 


Asn 


Arg 


Arg 


Asp 


Arg 










705 


Glu 


Arg 


Glu 


Arg 


Leu 










720 


Gly 


Arg 


Tyr 


Arg 


Arg 










735 


Leu 


Gin 


Asp 


Val 


~r1 ^ 

He 










15 


VaX 


AJLa 


AXa 


Pro 


Pro 










3 U 


Tyr 


Ser 


His 


Arg 


Met 












Ala 


"TV T 

Ala 


Pro 


Tyr 


Phe 










oO 


Leu 


Leu 


Gin 


Phe 


Pro 










75 


Lys 


Leu 


Glu 


TV n _ 

Ala 


Leu 










y u 


Gly 


Arg 


Arg 


Val 


Leu 










105 


He 


Leu 


Glu 


Met 


Phe 










120 


He 


Asp 


Glu 


Asn 


Ala 










"1 Q c 


Ser 


Phe 


Asn 


Arg 


Asp 










150 


His 


Ser 


Arg 


Thr 


Thr 










ICC 


Val 


Phe 


Tyr 


Asp 


Asn 










1 O A 


Gin 


Glu 


Trp 


Cys 


Asp 










195 


Tyr 


Arg 


Leu 


Val 


Ser 










210 


Asn 


Gly 


Thr 


Lys 


Asp 










225 


Asp 


Tyr 


Ser 


Met 


Ala 










240 


Phe 


Glu 


Val 


Tyr 


Ser 










255 


Ala 


Glu 


Glu 


Phe 


Val 










270 


Thr 


He 


Ala 


Pro 


Lys 










285
7\ n "^a

A±a. 


Arg 


Pro 


pne 


-]- n ^ 

lie 


Glu 


7\ 1 =3 

Ala 


Leu 


Lys 


Ser 


lie 


(jlU 


Tyr 


Leu 










O Q A 










O Q R 

z y o 










"3 n n 
J u u 






Asp 


£\±Ci 


vjxn 


Lys 


Ser 


TV T =a 

AX a 


VjXil 


f^T 11 
VjXU 


vjiy 


vai 


Leu 


ijJLy 


Pro 










n R 










O XU 










"51 R 
OlO 


Tin a 


iiir 


Asp 


i^Xo. 


Leu 


Ser 


Ser 


Asp 


Ser 


r*! 11 
VjXU 


Asn 


jxieu 


Pro 


Cys 


Asp 










o n 










O R 
OZ O 










OOU 


m n 

Vj-L U. 


J- u. 


Pro 


oer 


tjxn 


Leu 


IjIU 


m 11 

IjXU 


Leu 


/ixa 


Asp 


Jrne 


jyiei. 


r^i 11 

VjXU 


Ljxh 










"3 c: 










o 4U 










Q y1 R 
O 40 


Li 611 


Txir 


Pro 


xxe 


kjXU 


Lys 


Tyr 


TV T SI 

A J. a 


Leu 


Asn 


Tyr 


Leu 


r*"\ 11 

VjXU 


Leu 


irne 










"5 R n 
O D U 










"5 R R 
O OO 










n 

o b U 


±11 S 


j.nr 


Ser 


xxe 


^jXU 


vj-xn 


VjXU 


Lys 


r*! 11 

oXU 


Arg 


Asn 


Ser 


VjXU 


Asp 


TV n "=1 
Axa 










Q C K 
ODD 










*i "7 n 
o /U 










T7 R 

o /o 






Tnr 


Axa 


vax 


Arg 


AX a 


Trp 


VjXU 


Jrne 


Tirp 


Asn 


Leu 


Lys 


Thr 










'I Q n 

-5 O U 










"3 D R 
O O O 










"3 Q n 

o y u 


Leu. 


(jin 


(jlU 


Arg 


(jXU 


AXa 


Arg 


Leu 


Arg 


Leu 


CjlU 


"1 „ 
Cjin 


VjlU 


CjIU 


TV 1 

Ala 










D cr 










4U U 










>i n c 
4Uo 




Leu. 


Leu 


Tnr 


Tyr 


Tiir 


Arg 


LrXU 


Asp 


Ala 


Tyr 


Ser 


Met 


Glu 


Tyr 




















41o 










yi o A 
4Z U 




Tyr 


IjXU 


Asp 


vax 


Asp 


iji-y 


m 1-1 
ijxn 


Tnr 


r^i 11 
^jXU 


vai 


jyiec 


Pro 


Leu 


Trp 










/IOC 










yi Q n 










yt o c 
4o O 


Trmr 


Pro 


Pro 


Tnr 


Pro 


Pro 


vjxn 


Asp 


Asp 


Ser 


Asp 


xxe 


Tyr 


Leu 


Asp 




















44o 










yi c A 
4o U 


Ser 


va.1 


Met 


Cys 


Leu 


Met 


Tyr 


CjIU 


TV n >= 
Ala 


Thr 


Pro 


lie 


Pro 


Glu 


Ala 










>1 c c 










4oU 










4d O 


Lys 


Leu 


Pro 


Pro 


vax 


Tyr 


vax 


Arg 


Lys 


VjXU 


Arg 


Lys 


Arg 


rlXS 


Lys 










/I "7 n 










4 / o 










yi o A 
4oU 


Tiur 


Asp 


Pro 


Ser 


Ala 


Ala 


Cjiy 


Arg 


Lys 


Lys 


Lys 


Gin 


Arg 


Hxs 


Cjiy 




















4y u 










yi Q c 

4y O 




AX a. 


va.x 


vax 


Pro 


Pro 


Arg 


Ser 


Leu 


Jrne 


Asp 


Arg 


TV T =a 

Axa 


Tnr 


Pro 










n: n n 
3 U U 










R n c: 
o Ud 










C T A 

olu 




Leu 


Leu 


Lys 


xxe 


Arg 


Arg 


LjXU 


LiXy 


Lys 


LrlU 


Lrin 


Lys 


Lys 


Asn 










DlD 










Ron 
o^ U 










R O R 
OZ O 




Leu 


Leu 


Lys 


(jXn 


Cjin 


vai 


Pro 


Jr'ne 


Ala 


Lys 


Pro 


Leu 


Pro 


Thr 










DO U 










R "3 c: 
O O 










c yi A 
o4U 


file 


AX a. 


Lys 


Pro 


Tiir 


Axa 


VjXU 


Pro 


vjXy 


Vj-Ln 


Asp 


Asn 


Pro 


CjlU 


Trp 










D40 










c c r\ 

ooU 










EI C 

ooo 


Leu 


xxe 


Ser 


ijXu. 


Asp 


m 

Trp 


/\xa 


Leu 


Leu 


m Tl 
vjxn 


T\ T 

Aia 


vai 


Lys 


Vjin 


Leu 










R c n 










R £r R 

obo 










A 

o / u 


Leu 


QllU 


Leu 


Pro 


Leu 


Asn 


Leu 


Tnr 


lie 


vai 


Ser 


Pro 


Ala 


His 


Thr 










C "7 c 










con 
O o U 










tz o cz 

ooo 


Pro 


Asn 


Tzrp 


Asp 


Leu 


vai 


Ser 


Asp 


vai 


vai 


Asn 


Ser 


Cys 


Ser 


Arg 










con 










c Ci n: 
by O 










ez r\ r\ 
D (J U 


zie 


Tyr 


Arg 


Ser 


Ser 


Lys 


tain 


Cys 


Arg 


Asn 


Arg 


Tyr 


CjIU 


Asn 


val 










c n R 










blU 










bio 


He 


lie 


Pro 


Arg 


CjIU 


CjIU 


^ly 


Lys 


Ser 


Lys 


Asn 


Asn 


Arg 


Pro 


Leu 










c o r\ 
bZ U 










o c 
bz o 










bo U 


Arg 


Tnr 


Ser 


CjXn 


xxe 


Tyr 


Axa 


tjXn 


Asp 


tjlU 


Asn 


Ala 


Tnr 


"LI A ri 

hLXS 


Thr 










"5 R 

boo 










b 4 U 










c yi R 
b40 




Leu 


Tyr 


Tiir 


Ser 


rilS 


Fne 


Asp 


Leu 


Twr^ 4- 

jyiec 


Lys 


Met 


Tnr 


Ala 


tiriy 










bo U 










<^ R R 

boo 










b b U 


Lys 


Arg 


Ser 


Pro 


Pro 


lie 


Lys 


Pro 


Leu 


Leu 


Cjly 


Met 


Asn 


Pro 


Phe 










c R 
boo 










b /U 










b /o 


Gin 


Lys 


Asn 


Pro 


Lys 


rllS 


Ala 


Ser 


vai 


Leu 


TV -1 _ 

Ala 


Glu 


Ser 


Gly 


lie 










boU 










COR' 

b oo 










C Q A 

by u 


Asn 


Tyr 


Asp 


Lys 


Pro 


Leu 


Pro 


Pro 


lie 


Gin 


Vai 


TV T « 

Ala 


Ser 


Leu 


Arg 










69 o 










700 










705 




VjrXU. 


Arg 


xxe 


jeiXd 


Lys 


VjX u. 


J_l_y o 


XiJ/^& 


/\xa 


Leu 


jB.xa 


Asp 


vjxn 


VjXil 










710 










715 










720 




Ala 


Gin 


Gin 


Pro 


Ala 


Vai 


Ala 


Gin 


Pro 


Pro 


Pro 


Pro 


Gin 


Pro 










725 










730 










735 


Gin 


Pro 


Pro 


Pro 


Pro 


Pro 


Gin 


Gin 


Pro 


Pro 


Pro 


Pro 


Leu 


Pro 


Gin 










740 










745 










750 


Pro 


Gin 


Ala 


Ala 


Gly 


Ser 


Gin 


Pro 


Pro 


Ala 


Gly 


Pro 


Pro 


Ala 


Val
■"7 d C

/ DD 




Pro 


Gin 


irro Gin 








in Ci 
/ / U 




TV n =1 
Ala 


Pro 


Ala Lys 








T Q R 
/ OD 


"A 1 ■= 


TV 1 =a 

Ala 


vai 


i»eu Ala 








0 n n 
0 U U 




jvLet 


Pro 


inr Giy 








olD 


J- JLe 


TV 1 =1 

Ala 


Giy 


val Jb'ro 








0 o n 


lieu 


TV n 1^ 
Ala 


Ser 


Pro Val 








845 




TV 1 =. 

Ala 


Pro 


Ala Gin 








0 <c 0 
0 D U 


va.x 


Giy 


Ser 


Pro Ala 








875 


Tnr* 


Tnr 


Gin 


Giy Val 








890 




Val 


Thr 


Tiir Asn 








905 


Leu 


Val 


Pro 


Gin Val 








920 


Lys 


Tnr 


lie 


Thr Pro 








935 


Gin 


n -.--1 

(jin 


Gin 


Gin Gin 








(A cr r\ 

95 0 




ljr±n 


Gin 


Gin Gin 








y DO 


Txir 


Ser 


Gin 


Val Gin 








980 


Pro 


TV T ^ 

Ala 


Gin 


lie Lys 








n a tz 

995 




Lys 


Met 


Gin Lys 








1010 


Pro 


Pro 


Gin 


Ala Gin 








1025 


Gin 


Val 


Gin 


Tlir Ser 








1U4(J 


Tiir 


Tnr 


Val 


Tiir Ala 








1055 


Th.r 


Val 


Ala 


Asn Leu 








1070 


Gin 


Met 


Gin 


Thr Gin 








1085 


Lys 


Pro 


Pro 


Val Val 








1100 


Gxy 


vai 


Tnr 


Thr Leu 








lllD 


-Lie 


triy 


Gin 


Fro Gin 








113 0 


Ala 


Arg 


rLlS 


JxLet Gin 








1145 


Gin 


Gin 


Gin 


Lys Ala 








1160 


Ala 


Val 


Gin 


Gin Lys 








1175 


Ala 


Gin 


Gin 


Lys Val 








1190 


Gin 


Phe 


Leu 


Thr Thr 








1205 


Ala 


Gin 


Gin 


Val Gin 








1220 









'~l C C\ 


Pro 


Gin 


Pro 


Gin mr 










Ala 


Gin 


Pro 


Ala lie 








•7Q n 


Giy 


Thr 


lie 


Lys Thr 








0 A C 


Ala 


Val 


Ser 


Giy Asn 








Q 0 n 


Ala 


Ala 


Thr 


pne Gin 








0 "3 C 


TV 1 ^ 

Ala 


Pro 


Giy 


Ala Leu 








0 c rv 


Val 


Val 


His 


Thr Gin 








0 <r IT 

865 


Thr 


Ala 


Thr 


Pro Asp 








0 0 r\ 
880 


Arg 


Ala 


Val 


Thr Ser 








895 


Leu 


Thr 


Pro 


Val Gin 








91U 


Ser 


Gin 


Ala 


Thr Giy 








A 0 cr 

925 


Ala 


His 


Phe 


Gin Leu 








940 


Gin 


Gin 


Gin 


Gin Gin 








A C C 

955 


Gin 


Gin 


Gin 


Gin Gin 








Q T n 


Val 


Pro 


Gin 


lie Gin 








Q 0 C 


Ala 


Val 


Giy 


Lys Leu 








1 A A A 
lUUU 


Gin 


Lys 


T Alt 

Leu 


Gin Meu 








1 A 1 C 
lUlD 


Ser 


Ala 


Pro 


Pro Gin 








1 AO A 

1030 


Gin 


Pro 


Pro 


Gin Gin 








1045 


Pro 


Arg 


Pro 


Giy Ala 








1 A ^ A 

1060 


Gin 


Val 


Ala 


Arg Leu 








1075 


TV T 

Ala 


Pro 


Gin 


Pro Ala 








1 AAA 

1090 


Ser 


Val 


Pro 


Ala Ala 








1 1 A C 

1105 


Pro 


Met 


Asn 


Val Ala 








1 1 0 A 

Hz U 


Lys 


TV n 

Ala 


TV 1 _ 

Ala 


Giy Gin 








1135 


Gin 


Leu 


Leu 


Lys Leu 








1 1 cr A 

1150 


He 


Gin 


Pro 


Gin Ala 








1165 


He 


Thr 


Ala 


Gin Gin 








1180 


Ala 


Tyr 


Ala 


Ala Gin 








1195 


Pro 


He 


Ser 


Gin Ala 








1210 


Thr 


Gin 


He 


Gin Val 








1225 









/ OD 


Gin 


Pro 


Gin 


Fro val 








"7 p n 
/ 0 u 


Tnr 


Tnr 


Vaxy 


o-Ly oer 








/ y D 


Ser 


vai 


Tnr 


\j±y 1 nr 








CD X u 


vai 


±±e 


va± 


Asn inr 








Q 0 


Ser 


lie 


Asn 


Lys Arg 








Q /I n 


Thr 


Thr 


Pro 


Giy Giy 








Q R 


Pro 


Pro 


Pro 


Arg Ala 








OTA. 


Leu 


vai 


Ser 


Meu Ala 








0 OD 


Val 


Thr 


Ala 


Ser Ala 








AAA 

y uu 


Thr 


Pro 


TV T -a 

Ala 


Arg Ser 








QIC 

y Id 


Val 


Gin 


Leu 


Pro Giy 








A '3 A 

93 0 


Leu 


Arg 


Gin 


Gin Gin 








y 45 


Gin 


Gin 


Gin 


Gin Gin 








Q C A 

y 0 u 


Gin 


Gin 


Thr 


Thr Thr 








Q "7 

y /D 


Giy 


Gin 


TV n 

Ala 


Gin oer 








Q Q n 
y y u 


Thr 


Pro 


GlU 


His Leu 








1 A A C 

lU Ud 


Pro 


Pro 


Gin 


Pro Pro 








1 A 0 A 

lUz u 


Pro 


Thr 


TV 1 -1 

Ala 


Gin vai 








I AO C 

1035 


Gin 


Ser 


Pro 


Gin Leu 








1 A C A 

lUb U 


Leu 


Leu 


Thr 


Giy Thr 










Leu 


Gin 


Ala 


Gin Giy 








1080 


Gin 


Val 


Ala 


Leu Ala 








•1 A A C 

1095 


Val 


Val 


Ser 


Ser Pro 








1 "1 1 A 
lllU 


Giy 


lie 


Ser 


val Ala 








llzo 


Thr 


vai 


vai 


Ala Gin 








1 1 yi A 
114U 


Lys 


Gin 


Gin 


Ala val 








1155 


TV 1 ^ 

Ala 


Gin 


Giy 


Pro Ala 








T 1 T A 
11 / U 


lie 


Thr 


Thr 


Fro Giy 








1185 


Pro 


Ala 


Leu 


Lys Thr 








1200 


Gin 


Lys 


Leu 


Ala Giy 








1215 


Ala 


Lys 


"Leu 


Pro Gin 








1230 
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Val 


Val 


Gin 




J. 111. 


ir J- w 


Val 




J_ _L t; 




Gin 


Va.1 Ala 
















1240 








1245 


Ssnr 


Ala 


Ser 


Gin Gl n 






Pro 


Gin Thr 


V Cl_L 


AT a 


Leu 


Thr Gin 








1250 








1255 








1260 


Ala 


Thir 


Ala 


Ala G1 \7 




Gln 


Val 


Gin Me»f 


He 


Pro 


Ala 


Val Thr 

V Ci^ ^ .1.x XJ., 








_L ^ J 








1270 








1275 


Ala 




Ala 


Gin V?^ 1 


V CL JL 


Gin 


Gin 




He 


Gin 


Gin 


m Ti Va 1 

V-TJ.J.X V d J- 








1280 








1285 








1290 


Val 


Thr 


Thr 


Ala Ser 


Ala 


Pro 


Leu 


Gin Thr 


Pro 


Glv 


Ala 


Pro Asn 








1295 








1300 








1305 


Pro 


Ala 


Gin 


Val Pro 


Ala 


Ser 


Ser 


Asp Ser 


Pro 


Ser 


Gin 


Gin Pro 








1310 








1315 








1320 


Lys 


Leu 


Gin 


Met Arg 


Val 


Pro 


Ala 


Val Arg 


Leu 


Lys 


Thr 


Pro Thr 








1325 








1330 








1335 


Lys 


Pro 


Pro 


Cys Gin 



















1340

<210> 5 

<211> 560 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 3464492CD1 

<400> 5 



Met Thr Phe Ser Arg Leu Leu Asn Tyr Lys Tyr Ser Asp Thr Leu

1 5 10 15

Lys Lys Met Asp Pro Asp His Leu Val Ala Leu Val Thr Glu Val

20 25 30

Ile Pro Asn Tyr Ser Cys Leu Val Phe Cys Pro Ser Lys Lys Asn

35 40 45

Cys Glu Asn Val Ala Glu Met Ile Cys Lys Phe Leu Ser Lys Glu

50 55 60

Tyr Leu Lys His Lys Glu Lys Glu Lys Cys Glu Val Ile Lys Asn

65 70 75

Leu Lys Asn Ile Gly Asn Gly Asn Leu Cys Pro Val Leu Lys Arg

80 85 90

Thr Ile Pro Phe Gly Val Ala Tyr His His Ser Gly Leu Thr Ser

95 100 105

Asp Glu Arg Lys Leu Leu Glu Glu Ala Tyr Ser Thr Gly Val Leu

110 115 120

Cys Leu Phe Thr Cys Thr Ser Thr Leu Ala Ala Gly Val Asn Leu

125 130 135

Pro Ala Arg Arg Val Ile Leu Arg Ala Pro Tyr Val Ala Lys Glu

140 145 150

Phe Leu Lys Arg Asn Gln Tyr Lys Gln Met Ile Gly Arg Ala Gly

155 160 165

Arg Ala Gly Ile Asp Thr Ile Gly Glu Ser Ile Leu Ile Leu Gln

170 175 180

Glu Lys Asp Lys Gln Gln Val Leu Glu Leu Ile Thr Lys Pro Leu

185 190 195

Glu Asn Cys Tyr Ser His Leu Val Gln Glu Phe Thr Lys Gly Ile

200 205 210

Gln Thr Leu Phe Leu Ser Leu Ile Gly Leu Lys Ile Ala Thr Asn

215 220 225

Leu Asp Asp Ile Tyr His Phe Met Asn Gly Thr Phe Phe Gly Val

230 235 240

Gln Gln Lys Val Leu Leu Lys Glu Lys Ser Leu Trp Glu Ile Thr

245 250 255

Val Glu Ser Leu Arg Tyr Leu Thr Glu Lys Gly Leu Leu Gln Lys

260 265 270

Asp Thr Ile Tyr Lys Ser Glu Glu Glu Val Gln Tyr Asn Phe His

275 280 285 

Ile Thr Lys Leu Gly Arg Ala Ser Phe Lys Gly Thr Ile Asp Leu

290 295 300

Ala Tyr Cys Asp Ile Leu Tyr Arg Asp Leu Lys Lys Gly Leu Glu

305 310 315

Gly Leu Val Leu Glu Ser Leu Leu His Leu Ile Tyr Leu Thr Thr

320 325 330

Pro Tyr Asp Leu Val Ser Gln Cys Asn Pro Asp Trp Met Ile Tyr

335 340 345

Phe Arg Gln Phe Ser Gln Leu Ser Pro Ala Glu Gln Asn Val Ala

350 355 360

Ala Ile Leu Gly Val Ser Glu Ser Phe Ile Gly Lys Lys Ala Ser

365 370 375

Gly Gln Ala Ile Gly Lys Lys Val Asp Lys Asn Val Val Asn Arg

380 385 390 



Leu Tyr Leu Ser Phe Val Leu Tyr Thr Leu Leu Lys Glu Thr Asn

395 400 405


He 


Trp 


Thr 


Val 


Ser 


Glu 


Lys 


Phe 


Asn 


Met 


Pro 


Arg 


Gly Tyr 


He 










410 










415 








Va!l 


420 


Gin 


Asn 


Leu 


Leu 


Thr 


Gly 


Thr 


Ala 


Ser 


Phe 


Ser 


Ser 


Cys 


Leu 










425 










430 










435 


His Phe Cys Glu Glu Leu Glu Glu Phe Trp Val Tyr Arg Ala Leu

440 445 450

Leu Val Glu Leu Thr Lys Lys Leu Thr Tyr Cys Val Lys Ala Glu

455 460 465

Leu Ile Pro Leu Met Glu Val Thr Gly Val Leu Glu Gly Arg Ala

470 475 480

Lys Gln Leu Tyr Ser Ala Gly Tyr Lys Ser Leu Met His Leu Ala

485 490 495

Asn Ala Asn Pro Glu Val Leu Val Arg Thr Ile Asp His Leu Ser

500 505 510

Arg Arg Gln Ala Lys Gln Ile Val Ser Ser Ala Lys Met Leu Leu

515 520 525

His Glu Lys Ala Glu Ala Leu Gln Glu Glu Val Glu Glu Leu Leu

530 535 540

Arg Leu Pro Ser Asp Phe Leu Val Leu Trp Leu Leu Pro Leu Thr

545 550 555

Lys His Glu Ala Ile

560

<210> 6 
<211> 436 
<212> PRT 
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<220> 
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<223> Incyte ID No: 1794336CD1 

<400> 6 



Met Glu Glu Phe Lys Ser His Ser Pro Glu Arg Ser Ile Phe Ser

1 5 10 15

Ala Ile Trp Glu Gly Asn Cys His Phe Glu Gln His Gln Gly Gln

20 25 30

Glu Glu Gly Tyr Phe Arg Gln Leu Met Ile Asn His Glu Asn Met

35 40 45

Pro Ile Phe Ser Gln His Thr Leu Leu Thr Gln Glu Phe Tyr Asp

50 55 60

Arg Glu Lys Ile Ser Glu Cys Lys Lys Cys Arg Lys Ile Phe Ser

65 70 75

Tyr His Leu Phe Phe Ser His His Lys Arg Thr His Ser Lys Glu

80 85 90

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Leu 


Ser 


Glu 


Cys 


Lys 


Leu 


Phe 


Lys 


Gin 


Gin 


Cys 


Lys 


Glu 


Cys 


Trp 

1 9 R 


His 


Leu 


Arg 


He 


His 

_L f± U 


Cys 


Gly 


Lys 


Ala 


Phe 


Arg 


He 


His 


Thr 


Gly 

J. / U 


Lys 


Ala 


Phe 


Arg 


Gin 

1 Q R 
J. O D 


His 


Thr 


Gly 


Glu 


Lys 

o n n 
z u u 


Phe 


He 


Arg 


Gly 


Phe 

O 1 R 


Gly 


Glu 


Lys 


Pro 


Tyr 
o T n 


His 


Arg 


Ser 


His 


Leu 

O /I R 


Lys 


Pro 


Tyr 


Glu 


Cys 
z o U 


Ser 


Ser 


Phe 


Ser 


His 

O '7 R 


Tyr 


Glu 


Cys 


His 


Glu 

o Q n 
z y u 


Leu 


Thr 


Leu 


His 


Gin 
n R 


Cys 


Lys 


Glu 


Cys 


Gly 
J z u 


Arg 


His 


Gin 


Arg 


He 

R 

J) J D 


lie 


Cys 


Gly 


Lys 


Ala 
"5 R n 


Gin 


Arg 


He 


His 


Thr 

"5 R 
J D D 


Gly 


Lys 


Ala 


Phe 


Ser 

O O A 


lie 


His 


Ser 


Gly 


Lys 
395 


His 


Arg 


Leu 


Gin 


Leu 
410 


Lys 


Pro 


Val 


Arg 


Phe 
425 



Ser 



Glu 


Cys 


Thr 


Glu 


He 
100 


Thr 


He 


Gin 


Asn Gly 










115 


Lys 


Ala 


Phe 


V CLJ. 


TTn CI 
n JL o 

130 


Asn 


Gly 


Glu 


Lys 


7\ -yrr 

145 


Asn 


Tyr 


Gly 




vjj. U. 

160 


Glu 


Lys 


Pro 


Tyr 


<of -LU 

175 


Arg 


Ser 


Gin 


Leu 


Thr 

190 


Pro 


Tyr 


\jJ_U 


Cys 


Lys 
205 




Leu 


TTUi -v- 
X 111. 


Glu 


His 
220 


Glu 


Cys 


Lys 


Glu 


Cys 
235 


Thr 


He 


His 


Gin Arg 










250 


Arg 


vj J-U 


Cys 


Gly Lys 










265 


His 


Gin 


Lys 


He 


His 
280 


Cys 


Gly 


Lys 


Ala 


Phe 
295 


Arg 


He 
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Lys 


Thr 


Phe 


Arg 


Gin 
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Thr 


Gly 


Glu 


Lys 
340 


Phe 


Arg 


Leu 
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Ser 
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Gly 


Glu 


Lys 


Pro 


Tyr 
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Tyr 


His 


Ser 


Ser 


Phe 
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Lys 


Pro 


Tyr 


Gin 


Cys 
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Asn 


Leu 


His 


Gin 


Thr 
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Pro 
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Leu 


Pro 
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430 
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Cys 
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Lys 


Cys 


Asn 


Glu 
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Ser 


His 


Phe 
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Glu 
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Leu 


Thr 
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His 


Gin 
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Cys 
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Cys 


Gly 
JLo u 
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Gin 


Arg 


Leu 

1 Q R 
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Lys 


Ala 
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Arg 


Leu 
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n n 
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n 
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He 
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Glu 
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Gin 
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Leu 
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Ala 
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Asn He 
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Ser Pro 
30 



<400> 7 

Met Asp Arg Asp Leu Glu Gln Ala Leu Asp Arg Thr Glu

1 5 10
Thr Glu Ile Ala Gln Gln Arg Arg Pro Arg Arg Arg Tyr
20 25 
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Met Lys Arg Arg Leu 
1 5 
Gln Gln Arg Arg Ile
20

His Arg Val Leu Ala
35

Glu Thr Met Gln Ser
50

Ser Tyr Gln Val Ser
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Met Lys Glu 
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Ser Met Ala 
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Ser Arg Ala 
Phe Pro Glu 
Glu Ser Val 
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J VJ VJ 


Xj^o 


Tl ^ 
X X c 


XJ_y 0 






"^1 R 

0 X J 


xrxlc: 


XJC:Ll 


V7X LX 






0 0 \J 


Lys 


fill n 


A1 ^ 






AR 
0 ft 3 


V ctx 


j.yx 


Ala 






0 

0 0 u 


T 

Leu 


Ot=X 


m n 

VjtX LX 






"7 R 


Leu 


J_l tr: U. 


ot=x 






QO 

0 J7 VJ 




XXX >3 


m vr 
vjx_y 






AOR 
ft U 3 


VirXXl 


Arg 


Jrx U 






A9 0 

ft^ U 


vaxy 


± XXX 


TVi T" 

X XXX 






A'^ R 
ft J O 


Lys 




OfcJX 






AR 0 
ft J Lf 


VjX U. 


0 tJX 


Leu 






Af^ R 

ft 0 3 


VjX LX 


fa-Xd 


lyr 






ftOU 


CjXU 


vax 


Tl c» 

xxe 






/I 0 R 

fty 3 


Leu 


Gly 


Lys 






510 


Gly 


Tyr 


Lys 






525 


Ala 


Thr 


Glu 






540 
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\jJLY 


J_ j_ 


.rA_L a. 




(Zi 1 1 

VjX LL 


Tl 

xxe 




lyr 


Al ^^9 


OfciX 




Lys 




Leu 


ml \r 










3 f± O 










S R 0 
3 D \J 










RRR 
J 3 -J 




OCX. 


±yj. 


Arg 


Al P» 


Leu 


irX VJ 




otsx 


xyx 


al n 

\3X11 


Gin 


Pro 


xi_y k3 


Cys 










«J Q w 




















•J I \j 






■"■J-y 


X IIX 


irX 0 


Leu 


v^ys 


Lys 


rtl 11 

VjX LI 


'\7pi 1 
V Ct X 


Leu 


A en 


A en 


±11X 


irp 




















-.J 0 \j 










•J 0 «j 


V CLX 


Ot=J- 




It J_ 


oc=x 


irp 


vDcJX 


Pl n 
VjX LJ. 


A Gl^ 


Ofci^X 


X llX 


Jcriic: 


V^l 

V CLX 


OCX 


0 csx 










Q n 

3 -/ VJ 










R Q R 
J 3 










0 0 
\j \j \j 


T Area 


JUyta 


J. llJ. 


Vrrj_ll 


lyr 


fill n 
O'X LL 


r!l 11 
vjX U. 


llXo 


Tl pa 

X X 


j.yx 


A "K*!^ 

-Hixy 


V— y 0 


al 11 

virX Li. 


A cnk 


Glu 










0 U J 










fil 0 
0 X vJ 










fil R 

0 X J 






VjjX Li. 


Leu 


Asp 


\Tsi 1 
Va.X 


V a.X 


Leu 


Cll n 
VjX u. 


± IIX 


A en 


Leu 


Ala 


±llx 


Tl 
X X t; 










u 










0 ^ D 










0 0 u 




V CtJ. 




ni n 

J_ Ll 


Al « 


Tl «=k 


f^l n 

VjXll 


T.T7"e 

xiy fc> 




XI G Li. 


OtIX 


Avrr 


XJ Li 


OCX 


Ala 










D J D 










0 ^ u 










O^D 








2\1 a 


Lys 


jfiie 


Arg 


Leu 


Asp 


Asn 


TJnr 


T.QII 

1j6(X 


ml -XT 


PI 

kjxy 


rnVi -u— 
1 ilx 










<^ n 










R R 

ODD 










fi n 

0 D U 






V CL J- 


Tl *= 

X X e 


JtlXS 


Arg 


Lys 


Al ;=! 
AX a. 


Leu 


ml n 




Tl ^ 
X X ti 


xyr 


Ala 


A onk 










ODD 










D / U 










fi7 R 
D / D 






AT a 

/Tl JL CL 


7\ GT-^ 


Tl <=> 

J.X C7 


Tl <=» 

JL X 


A or^ 


rji \r 
wxy 


T lOi 1 


jnX y 


Lys 


Acjn 

AOll 


It X vJ 


ocx 


Tl 
xxe 










<^ Q n 

D 0 U 










000 










D 17 U 




Va.J- 


Pro 


Tl tfa 
XX6 


\7a 1 




Lys 


Arg 


T. Al 1 


Lys 


1*161^ 


T.^re 
Xiyto 


ml n 


Pi n 
LjX Li. 


ml n 

V7X Li. 










C Q 










"inn 










7n R 

/ UD 




Arg 


r*! 11 


AX a 


kjxri 


Arg 


'jrxy 


fne 


Asn 


Lys 


vax 


Trp 


Arg 


Pi 11 


PI n 










Tin 
/ X u 










/ X D 












Ash 




T 

Lys 


m 

Tyr 


Tyr 


Leu 


Lys 


Ser 


Leu 


Asp 


xlxS 


m n 


kjxy 


Tl ca 

xxe 


Asn 










/ZD 










"7 n 










7 R 
/ 0 D 










Asp 


±xL±. 




V a.X 


T .01 1 


"•X y 




Lys 


oex 


T. A1 1 


XitJLi. 




















1 

/ 'iZj 










7 R n 




m n 

ox LL 


J- X c 


w JL UL 


0 c:l_ 


lie 


J.yx 


A CST^ 


fJl 11 
vjX Ul 


r\l. y 


nl n 


Glu 


ml n 

V3X11 


Al « 


Thr 










/ D D 










7 fin 












Vjj X Li. 


ni 11 


7\ C3 Tl 


Al a 


^jxy 


Vdx 


Pro 


V ax 


ni -^r 


Pro 


nxo 


T.01 1 


Q 0 V 

>D C^X 


Leu 


Al a 










TT n 










■7 T R 










7 p n 
/ 0 u 




11 




Lys 


vaXii 


Tl 

X X c: 


Leu 


nl n 

vdtX LI 






A 1 

i-i.Xct 


Al 


Leu 


Tl f=» 

X X ti 


Tl <=» 
X X tr 










"7 Q 

/ 0 D 










7 Q n 










7 Q R 

/ J7 D 






VCLX 


Lys 


Arg 


PI n 
vjXil 


± llX 


r«l ^7- 
vaxy 


Tl <=i 
X X 


nl n 

0x11 


Lys 


PI 11 

V7X Li 


Asp 


T 

Lys 


lyr 










Q n n 

oU u 










Q n R 

0 UD 










ox u 


T 


JL JL6 


Lys 


r*i -n 


Tl 
XX6 


jyiclL. 


Wn e 




■DV\o 
Jriics 


Tl 
X X ti 


Pro 


Asp 


Leu 


Leu 


Jtrllc 










oXD 










p*? n 
oz u 










P9 R 
oZ D 






TV 

Arg 


Ol 

vjxy 


Asp 


T 

Leu 


Ser 


Asp 


T73=i 1 


Pl n 
v^X U. 


ni n 


ml n 
vjX u. 


pi 11 


pi 11 

VjXU 


ml 11 

XjJL Li 










00 u 










00 D 










Pi^ A 
0 4fc U 






Asp 


T7sa 1 

Va.X 


Asp 


kjrXU 


Al a 


TJtir 


oxy 


Al ^ 


vax 


Lys 


Lys 


W-i e 

uxs 


Asn 










Q / cr 










Q R n 

0 D U 










P R R 
c3 D D 




vciJ. 


VaXy 


o-xy 


Ser 


Pro 


Pro 


Lys 


Ser 


Lys 


Leu 


Leu 


Jrllc 


Ser 


Asn 










0 0 U 










Q R 
0 OD 










P7 n 
0 / u 






i-i-Lo. 


r^l -r-i 

vjxn 


Lys 


Leu 


Arg 


r*l -ir 

oxy 




Asp 


r*i n 


■\7a 1 

vax 


m 

Tyr 


Asn 


Leu 










0/0 










R R n 
0 0 U 










PP R 
0 OD 


JrXie 


m 

Tyr 


VclX 


Asxi 


Asn 


A en 


irp 


iyr 


Tl #a 
X X ti 


xrXl^ 


lYie u 


Arg 


Leu 


Wt e 

nx 0 


ml n 










ft Q n 










RQR 

0 J? D 










Qnn 

_? VJ u 


J- J- ^ 






Leu 


Arg 


Leu 


Leu 


AT~rT 

xix y 


Tl f= 




0 tJX 


Gin 


Ala 


ml n 

VjX LL 


■tiX y 










Q Pi ^ 










X L/ 










Q1 R 

J7 X D 


vjJLIi 




rsT 11 


p'l 1 1 


r*i 11 


A en 




al n 

oXLl 


Arg 


ml 11 


m 

Trp 


Pin 
LjX Li 


Arg 


ml 11 


'^7a 1 
V dX 










Q 0 n 










Q 0 R 

y z D 










0*5 n 
y J u 




VarJ-y 


X JL 6 


T 

Lys 


Arg 


Asp 


Lys 


oSx 




oer 


Pro 


Ala 

r%xa 


Tl <=» 

xxe 


Pl n 

oxn 


T .cil 1 










y 0 D 










Q /I n 

y 41: u 










QAR 
y ^tD 


Arg 


Leu. 


Lys 


LjXU 


Pro 


jMeu 


Asp 


vax 


Asp 


vax 


ljXU 


Asp 


Tyr 


Tyr 


Pro 










Q c: n 

y D u 










955 










960 




Ph.e 


Leu 




Met 


Val 


Arg 


Ser 


Leu 


Leu 


Asp 


VJJL_Y 


Asn 


lie 


Asp 










965 










970 










975 


Ser 


Ser 


Gin 


Tyr 


Glu 


Asp 


Ser 


Leu 


Arg 


Glu 


Met 


Phe 


Thr 


lie 


His 










980 










985 










990 


Ala 


Tyr 


lie 


Ala 


Phe 


Thr 


Met 


Asp 


Lys 


Leu 


lie 


Gin 


Ser 


lie 


Val 










995 








1000 








1005 


Arg 


Gin 


Leu 


Gin 


His 


lie 


Val 


Ser 


Asp 


Glu 


lie 


Cys 


Val 


Gin 


Val 






WO 03/006618



PCT/US02/21971 









1 m n 
J- U X u 




Asp 


Leu 


±yr jjiBu 








1 no R 




Leti 


Asn 
















Lys 










1 n R R 

J. U 3 D 






Jriie 










J- u / u 




Leu 


Asp 


±iir VjJ_\l 












Arg 


Trp 


osr Asp 








J. J.UU 




Seir 


Pro 


VaXU jjeu 








1 1 T R 


Leu 


Pro 


Arg 


TV ^^^^ X T 








-L±o U 


VjJLUl 






kjjLij. j_iys 








1 1 /I R 


jyie u 




Asxi 










±x o u 


Leu 


Ash 


Ser 


1 yr i-iy s 








1 1 "7 R 


T\/r^ 4- 

Meti 


Tyr 


Arg 


Arg Thr 








lion 

xxy u 


Arg" 


va.x 


Ser 


Xjys Arg 








T o n R 

Xi6 VJ D 


Lys 


Trp 


Trir 


Lys Glu 








Xzz u 


Ser 


Lys 


Trp 


Leu Met 








1235 


Thr 


Thr 


Thr 


Cys Asp 








1250 


Lys 


Tyr 


Arg 


Val Lys 








1265 



<210> 9 

<211> 381 

<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature 
<223> Incyte ID No: 



<400> 9 








Met 


Glu 


Pro 


Leu 


Thr 


1 








5 


Glu 


Glu 


Trp 


Gin 


Cys 










20 


Asn 


Val 


Leu 


Leu 


Glu 










35 


Ala 


Val 


Ser 


Lys 


Pro 










50 


Glu 


Pro 


Trp 


Asn 


He 










65 


Val 


Met 


Ser 


Phe 


His 










80 


lie 


Lys 


Asp 


Ser 


Phe 










95 


Cys 


Glu 


Tyr 


Glu 


Asn 










110 


Asp 


Glu 


Cys 


Thr 


Gly 









1 m R 

X UX 3 


AX a 


VjXU 


Asn 


Aon ASIi 








X u 


Asn 


Ser 


Arg 


Ser Leu 








X o 


VjrXil 


T 

Leu 


Mot* 










X U D u 


Ser 


VcrXn 


(jxy 


m r-» T7"^i 1 








1 nT R 

X U / D 


vjJ.U 


IjXU 


Asn 










1 n Q n 
X u y u 


Tyr 


vax 


oXU 


Arg lyr 








1 1 nR 

XXU3 


Arg 


vjXU 


ilXS 


xieu AX a 








lion 
xxz u 


Arg 


Arg 


xxe 


Arg Lys 








1 1 T R 
XX J> 3 


vjXU 


vjXy 


Lys 


vjXU vjxy 








1 1 R A 

XXd u 


Ser 


Leu 


Asp 


Lys Leu 








1 1 C R 
XXOD 


Jxieu 


vax 


Tyr 


vax xxe 








lion 
XXo U 


Axa 


Leu 


Leu 


Arg AX a 








1 1 O R 

xxy 3 


Leu 


rLlS 


LrXn 


Arg Fne 








loin 
XzX U 


rlXS 


vax 


Pro 


"TV /~>r /"^ n T T 

.nJL g VjXVI 








T O O R 


Gly 


Glu 


Gly 


Leu Glu 








1240 


Thr 


Glu 


Thr 


Leu His 








1255 


Tyr 


Gly 


Thr 


Val Phe 








1270 



4019390CD1 



Phe 


Lys 


Asp 


Val 


Ala 










10 


Leu 


Asp 


Thr 


Ala 


Gin 










25 


Asn 


Tyr 


Arg 


Asn 


Leu 










40 


Tyr 


Leu 


He 


Thr 


Cys 










55 


Lys 


Arg 


His 


Glu 


Met 










70 


Phe 


Ala 


Gin 


Asp 


Leu 










85 


Gin 


Lys 


Val 


Thr 


Leu 










100 


Leu 


Gin 


Leu 


Arg 


Lys 










115 


His 


Lys 


Gly 


Gly 


His 






1020 



VjXy 


Ala 

AX a 


Thr 


Vjxy Vjxy 










T.oi 1 


vjj J. U. 












X U ^ \J 


VaXU 


Asn 












X \J o ^ 


vjxn 


Leu 


Tiir 


xxe Vjxu 








1080 




szjL KJ 


V CtJL 










1 n Q 

X U J? O 


jxie L- 


Asn 


Ser 


Asp inr 








1 1 1 n 

XX X w 


VjrXn 


Lys 


Pro 


vax Jrne 








1 1 9 R 

XX ^ >J 


Cys 


VrXri 


Arg 


Vaxy Arg 








1 1 /I n 

XX4tU 


Asn 


Ser 


Lys 


Xjy s JL nr 








XX3 ^ 


vjXU 


Cys 


Arg 


Phe Lys 








XX / u 


Lys 


Ser 


tjXU 


Asp Tyr 








1 1 Q R 
XX 03 


xllS 


/^T -1-1 

vjxn 


Ser 


HXS vjXU 








1 o n n 
XZ u U 


vjxn 


Axa 


m 

Trp 


vax Asp 








1215 


Met 


Ala 


Ala 


Glu Thr 








1230 


Gly 


Leu 


Val 


Pro Cys 








1245 


Phe 


Val 


Ser 


lie Asn 








1 9 K n 

X/^ o u 


Lys 


Ala 


Pro 




xxe 


vjXU 


pne 


Ser Leu 








1 R 

X 3 


Arg 


Asp 


Leu 


Tyr Arg 








o u 


vax 


jfne 


Leu 


tjxy xxe 










Leu 


CjXU 


VirXn 


Xiys Liys 








D u 


Vax 


Axa 


Lys 


Pro Pro 








n R 

/ 3 


Trp 


Pro 


Glu 


Gin Asn 








90 


Arg 


Arg 


Tyr 


Gly Lys 








105 


Gly 


Cys 


Lys 


His Val 








120 


Asn 


Thr 


Val 


Asn Gin 
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1 o tr 

±A b 


Cys 


Leu 


Thr 


Ala 


Thr 


Val 


Lys 


Val 


Phe 


Asp 


Arg 


His 


Thr 


Gly 


Asn 
1 '7 n 


Ser 


Phe 


Cys 


Val 


Leu 

J.OD 


Thr 


Arg 


Val 


Asn 


Ser 

^UU 


Asn 


Trp 


Phe 


Ser 


Thr 


Glu 


Lys 


Pro 


Tyr 


Lys 

o o rv 

23 0 


Ser 


Ser 


Gin 


Leu 


Thr 


Pro 


Asn 


Lys 


Cys 


Glu 

OCA 


His 


Leu 


Thr 


He 


His 

o *~7 nr 


Lys 


Tyr 


Glu 


Glu 


Cys 

o ri r\ 

29 0 


Thr 


Thr 


Gin 


Lys 


He 
3 05 


Lys 


Glu 


Cys 


Gly 


Lys 

•3 O 

3z 0 


His 


Lys 


Arg 


lie 


His 

•3 Q C 

Job 


Cys 


Gly 


Arg 


Ala 


Phe 
350 


Lys 


He 


His 


Thr 


Gly 
365 


Lys 


Leu 


Leu 


Thr 


Asp 
380 



<210> 10 
<211> 290 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature
<223> Incyte ID No: 

<400> 10 



Met 


Ser 


Ser 


Glu 


Ala 


1 








5 


Leu 


Ser 


Ala 


Ala 


Asp 
20 


Gly 


Ser 


Gly 


Gly 


Pro 
35 


Gly 


Asp 


Lys 


Lys 


Val 
50 


Trp 


Phe 


Asn 


Val 


Arg 
65 


Thr 


Lys 


Glu 


Asp 


Val 
80 


Val 


Thr 


Gly 


Pro 


Gly 
95 


Ala 


Asp 


Arg 


Asn 


His 
110 


Pro 


Arg 


Asn 


Tyr 


Gin 











±3 U 


Pro 


Ser 


Lys 


He 


Phe 
14b 


Lys 


Phe 


Ser 


Asn 


Ser 
± o U 


Lys 


His 


Phe 


Lys 


Cys 

JL /b 


Ser 


Gin 


Leu 


Thr 


Gin 
ly u 


Tyr 


Lys 


Cys 


Glu 


Glu 

O A C 

205 


Leu 


Thr 


Lys 


His 


Lys 

o o n 


Cys 


Glu 


Glu 


Cys 


Gly 

o o tr 
Z J 5 


Arg 


His 


Lys 


He 


He 
250 


Glu 


Cys 


Gly 


Lys 


Ala 

o /T cr 

zo5 


Lys 


He 


He 


His 


Thr 

o o n 

280 


Gly 


Lys 


Val 


Phe 


Ser 

295 


Leu 


His 


Thr 


Gly 


Glu 
310 


Ala 


Phe 


Asn 


Leu 


Phe 

•3 A C 

325 


Ala 


Gly 


Glu 


Lys 


Pro 
340 


Asn 


He 


Ser 


Ser 


Asn 
355 


Gly 


Lys 


Leu 


Asn 


Lys 
370 


Pro 











986452CD1 



Glu 


Thr 


Gin 


Gin 


Pro 










10 


Thr 


Lys 


Pro 


Gly 


Thr 










25 


Gly 


Gly 


Leu 


Thr 


Ser 










40 


He 


Ala 


Thr 


Lys 


Val 










55 


Asn 


Gly 


Tyr 


Gly 


Phe 










70 


Phe 


Val 


His 


Gin 


Gly 










85 


Gly 


Val 


Pro 


Val 


Gin 










100 


Tyr 


Arg 


Arg 


Tyr 


Pro 










115 


Gin 


Asn 


Tyr 


Gin 


Asn 











135 


Gin 


Cys 


Asn 


Lys 


Tyr 

T C A 

IbO 


Asn 


Arg 


Tyr 


Lys 


Arg 

lob 


Lys 


Glu 


Cys 


Ser 


Lys 

1 O A 

1 o U 


His 


Arg 


Arg 


He 


His 
195 


Cys 


Gly 


Lys 


Ala 


Phe 
zlO 


Arg 


He 


His 


Thr 


Gly 

A A cr 

225 


Lys 


Ala 


Phe 


Asn 


Gin 
240 


His 


Thr 


Glu 


Glu 


Lys 

o cr c 

zb5 


Phe 


Lys 


Gin 


Ala 


Ser 
z 70 


Gly 


Glu 


Lys 


Pro 


Tyr 

A O C 

285 


Gin 


Ser 


Ser 


His 


Leu 

•D 

3 00 


Asn 


Leu 


Tyr 


Lys 


Cys 
315 


Ser 


Asn 


Leu 


Thr 


Asn 
330 


Tyr 


Lys 


Cys 


Lys 


Glu 
345 


Leu 


Asn 


Lys 


Gin 


Glu 
360 


Cys 


Glu 


Glu 


Cys 


Asp 
375 



Pro 


Ala 


Ala 


Pro 


Ala 










15 


Thr 


Gly 


Ser 


Gly 


Ala 










30 


Ala 


Ala 


Pro 


Ala 


Gly 










45 


Leu 


Gly 


Thr 


Val 


Lys 










60 


He 


Asn 


Arg 


Asn 


Asp 










75 


Ala 


Glu 


Ala 


Ala 


Asn 










90 


Gly 


Ser 


Lys 


Tyr 


Ala 










105 


Arg 


Arg 


Arg 


Gly 


Pro 










120 


Ser 


Glu 


Ser 


Gly 


Glu 
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1 O n: 


Lys 


Asn 


Glu 


Gly 


Ser 
1 id n 


Arg 


Arg 


Pro 


Tyr 


Arg 

JL D D 


Arg 


Pro 


Tyr 


Gly 


Arg 
J. / u 


Gly 


Glu 


Val 


Met 


Glu 


Gly 


Arg 


Pro 


Val 


Arg 

o n n 
^ U U 


Phe 


Arg 


Arg 


Gly 


Pro 
z±o 


Asn 


Glu 


Glu 


Asp 


Lys 
o "a n 


ta-JLn 


Pro 


Pro 


Gin 


Arg 


Arg 


Arg 


Pro 


Glu 


Asn 
260 


Ala 


Ala 


Asp 


Pro 


Pro 
275 


Gin 


Gly 


Gly Ala 


Glu 










290 



<210> 11 

<211> 588 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature
<223> Incyte ID No: 

<400> 11 



Met 


Gly 


Leu 


Tyr 


Gly 


1 








5 


Met 


Thr 


Ser 


Glu 


Leu 
20 


Pro 


Gin 


Leu 


Thr 


Met 
35 


Gin 


Asn 


Ala 


His 


Gly 
50 


Asp 


Pro 


Asn 


Thr 


Thr 
65 


Asp 


Gly 


Lys 


Pro 


Pro 
80 


lie 


Asn 


Ser 


Ser 


Pro 

95 


Gin 


Trp 


lie 


Cys 


Asp 
110 


Gly 


Trp 


Lys 


Asn 


Ser 
125 


Phe 
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vj-Lil 


Ssr 


vajL 


Tlir 


Cys 




Leu 


Asn 


V a.X 


^jxy 


I'ys 


OCX irxie 


Vjxy 










215 










220 








225 


Arg Arg His His Leu Val Arg His Trp Leu Thr His Thr Gly Glu

230 235 240

Lys Pro Phe Gln Cys Pro Arg Cys Glu Lys Ser Phe Gly Arg Lys

245 250 255

His His Leu Asp Arg His Leu Leu Thr His Gln Gly Gln Ser Pro

260 265 270

Arg Asn Ser Trp Asp Arg Gly Thr Ser Val Phe

275 280

<210> 15 
<211> 539 
<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 7399016CD1 

<400> 15 



Met Gly His Cys Arg Leu Cys His Gly Lys Phe Ser Ser Arg Ser

1 5 10 15

Leu Arg Ser Ile Ser Glu Arg Ala Pro Gly Ala Ser Met Glu Arg

20 25 30

Pro Ser Ala Glu Glu Arg Val Leu Val Arg Asp Phe Gln Arg Leu

35 40 45

Leu Gly Val Ala Val Arg Gln Asp Pro Thr Leu Ser Pro Phe Val

50 55 60

Cys Lys Ser Cys His Ala Gln Phe Tyr Gln Cys His Ser Leu Leu

65 70 75

Lys Ser Phe Leu Gln Arg Val Asn Ala Ser Pro Ala Gly Arg Arg

80 85 90

Lys Pro Cys Ala Lys Val Gly Ala Gln Pro Pro Thr Gly Ala Glu

95 100 105

Glu Gly Ala Cys Leu Val Asp Leu Ile Thr Ser Ser Pro Gln Cys

110 115 120

Leu His Gly Leu Val Gly Trp Val His Gly His Ala Ala Ser Cys

125 130 135

Gly Ala Leu Pro His Leu Gln Arg Thr Leu Ser Ser Glu Tyr Cys

140 145 150

Gly Val Ile Gln Val Val Trp Gly Cys Asp Gln Gly His Asp Tyr

155 160 165

Thr Met Asp Thr Ser Ser Ser Cys Lys Ala Phe Leu Leu Asp Ser

170 175 180

Ala Leu Ala Val Lys Trp Pro Trp Asp Lys Glu Thr Ala Pro Arg

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185 190 195 



T< A1 1 

XJCLi 


JT-L O 


Gln


His 






Trp 


7\ en 


jrx w 


vjx_Y 


Asp 


xxXd 


XT X \J 


CXl n 

OXXJ. 


± IIX 










200










205










210




m n 


Gly Arg 








XXJLX 


Pro 


V ClX 


ijXjr 


AT a 


Glu 


Thr 












jCi JL ^ 










220 










225 




XJti Li 








Asp 


VclX 


raXcL 


fIJl n 

VjXlX 


XT X 


JrX 


Ser 


Asp 


OtiX 












230 










235 










240 




V a.a. 


Gly 


Pro 


Arg 


Ser 


^jxy 


Din o 


irX 


Jrx O 


vuXIl 


Pro 


Ser 


Leu 


ir±. O 










*± J 










250 










255 


Xj61J. 




Arg 


Ala 


ir i O 


kj-Ly 


r'l -n 

VjXIx 


Leu 


vjXy 


VjX u. 


T 

Lys 


Gln


Leu 


JrX w 


O t:X 










260










265 










270 






Ser 


Asp 


Asp 


Arg 


vax 


Lys 


Asp 


VjXU 


jrne 


Ser Asp 


T 

Leu 


ot=r 




















280 










285 




v?xy 


Asp 


Val 


Leu 


oer 


vjjXU. 


Asp 






Asp 


Lys 


Lys 


Gln


TV 










1^ 2/ U 










9 Q R 










nn 

O \J VJ 


i-i_Lci 


j_ri 


Ser 


Ser 






Ser 


irXie 


LjX U. 


Jrx O 


Tyr 


Pro 


Glu 


Arg 


Lys 










-J w J 










O X u 










T1 R 

O X J 






Gly Lys 


Lys 


Ser 


^jX u. 


Ser 


Lys 


^X u. 


2-i.xa 


Lys 


Lys 


Ser 


VjXU 










o ^ u 










"^9 R 
o ^ 










O J u 


Vj-LU 


Pjto 


Arg 


Ile


Arg 


Lys 


Lys 


Pro 


r*'\ IT- 

VjrXy 


Pro 


Lys 


Pro 


Gly Trp 


Lys 










O O J 










J ^ u 










A R 
-3 ^ D 




Lys 


Leu 


Arg 


'^ys 


tjXU 


Arg 


virXU 


vjX u. 


Leu 


Pro 


Thr 


Ile


Tyr 


Lys 










R n 










R R 
ODD 










n 

-3 O U 






Tyr 


Gln


r*i 17- 

JLy 


Vw^ys 


Txir 


i-iX d 


V ciX 


Tyr


Arg 


Gly Ala 


Asp 


Vjxy 










J VJ J 










7 n 














Lys 


Lys 


His 


JL jLe 


Lys 


VjrXU 


iixs 


rlXS 


VjXU 


LtXU 


Val 


Arg 


Glu 


Arg 










fin 
^ o u 










R R 










Q n 






Pro 


His 


Jr JL LJ 






7\ cm 


T 

Lys 


V clX 




Met 


Ile


Asp 


Arg 










J ^ Z) 










400










405


m 




Gln


Arg 






Lys 


Leu 


X X t= 


nx & 


- 

Txir 


Glu 


Val 


Arg 












X u 










415










420


m 

Tyr 


± jLe 


Cys 


Asp 




'^ys 


vy±y 


VjXXI 


Txir 


Pin Q 


Lys 


Gln Arg Lys


XlXS 










425










430










A*^ R 
r± J J 






Val 


His 


VjJLIl 


jyiet- 


Arg 


rlxS 


Ser 


r*i -XT 

v:jXy 


AX a 


Lys 


Pro 


Leu 


vjxn 










440










445










A R n 
^ 3 u 


Cys 




Val 


Cys 


^jjLy 


irne 


vjxn 


Cys 


Arg 


vjxn 


Arg 


Ala 


Ser 


Leu 


Lys 










^ O D 










A ^ n 
4fc o u 










/I R 
4t O D 


Tyr


±11 S 


Met 


Thr 


Lys 


rilS 


Lys 


AX a 


VjXU 


Thr 


LrJLU 


Leu 


Asp 


Phe 


Axa 










470 










475 










480 


Cys 


Asp 


Gln


Cys 


Gly 


Arg 


Arg 


Phe 


Glu 


Lys 


Ala 


His 


Asn 


Leu 


Asn 










485 










490 










495 


Val 


His 


Met 


Ser 


Met 


Val 


His 


Pro 


Leu 


Thr 


Gln


Thr


Gln


Asp 


Lys 










500 










505 










510 


Ala 


Leu 


Pro 


Leu 


Glu 


Ala 


Glu 


Pro 


Pro 


Pro 


Gly 


Pro 


Pro 


Ser 


Pro 










515 










520 










525 


Ser 


Val 


Thr 


Thr 


Glu 


Gly 


Gln


Ala 


Val 


Lys 


Pro 


Glu 


Pro 


Thr 





530 535 



<210> 16 

<211> 390 

<212> PRT 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 6996690CD1 



<400> 16 

Met Ala Glu Ile His Asn Gly Gly

1 5 
Asn Gly Glu Ile Phe Ser Glu His
20 

Gly Thr Glu Asn Thr Gly Asp Thr 



Glu Leu Cys Asp Phe Met Glu 
10 15 

Ser Cys Leu Asn Ala His Met 
25 30 

Tyr Asp Cys Asp Glu Tyr Gly 
35



Glu 


A en 


XT JLXc: 


Pro 




XltsU. 


nxto 












50 










Ser


V d _L 


Leu 




m n 

VjXXX 














65 








Asn 


Val 




Gln


RD 


X IIX 


±xp 


Tl n 


Ser


Asp 




Glu 


O \J 

Glu 




PVlPk 
JTXltS 


Val 










95 








Asn 




J — L t- 


X IIJ- 


■R-i ci 

XJLJ- fc> 


A cm 














110 












Arg




XT 


XllX 


±yr 


GoT~ 

OCX 










1 9 R 








J-t jr o 




ri-Lo 


± iix 


V CL J_ 


Vj J_ LL 














J. '± 










XT lies 


JrJuie 


Arg 


Tyr




Ser 


Tyr 










J- ^ J 












varJUy 


r^i n 
\jJ- u. 


Lys 


Pro 


Tyr 


vjXU. 










1 7 n 

X / L/ 










TJniT 


va± 


osr 


oer 


tlx o 


Leu 


vax 










X O D 












Lys 


Pro 


Tyr


OXXl 




Lys 










200 










Arg 




vjxy 


Leu 


TJnr 


Lys 


nx o 










91 s 








j_i_y o 


irJL w 


lyx 


m n 

\JJ. u. 




rVsn 


m n 

^ JL U. 












230












J_ic LL 


J. xXJL 


KyJL LI 


inx D 


PVi 


Lys 










245 










V^T J- LL 






VCLX 


^y to 


vjxy 


Lys 










260 












Asn 


His 


Pino 


7\ "KTT 

Axy 


XXc^ 


nxo 










97 ^ 












Glu 


Cys 


vjx_Y 




aXCL 


PViia 










9 Q fl 








Asn 


XI J. o 


Val 


Lys 


XX6 


"Wt C! 

JtlXo 


XXlX 


'jxy 










O W J 








Asp 




Gly Lys 


raXo. 


XT lie: 


Al 

AX a. 


T'Vi V* 
X IIX 










320 








Ile


Arg 


Thr 


His 


Thr 


Gly 


Glu 


Lys 










335 








Gly 


Lys 


Thr 


Phe 


Arg 


Ala 


Ser 


Ser 










350 








Ile


His 


Thr 


Gly 


Glu 


Lys 


Pro 


Tyr 










365 








Ala 


Tyr


Asn 


Arg 


Phe 


Tyr 


Leu 


Leu 



380 



<210> 17 
<211> 807 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature

<223> Incyte ID No: 7740866CD1 

<400> 17 

Met Lys Glu Trp Lys Ser Lys Met 

1 5 
Ser Ala Arg Ala Ala Ser Glu Lys 
20 





40 












Ser 


Ala 


Pro 


AX a 


el's -KT 

\3S.y 


VjXU 


Thr 




55 










D u 




Ala 


PVio 


OcX 


xjcll 


Jtrx \j 


JTX vj 




70 










/ O 


Gly Asp 


Lys 


OSX 


PVio 


\7X LL 


Tyr 




p 1=; 

C50 










Q n 


Asp 


vjXil 


vD v!^ J- 


rx_L o 


Leu 


Gin 


Ala 




inn 
X u u 










1 OR 

X U D 


X JLIX 


Leu 


Tyr 


Ol n 


VorXIX 


Lys 


Pin 




lie: 

xxd 










X Z U 


Txir 


oer 




A1 a 


V CL J. 


OCX. 


V a. J- 




T "3 n 
Xc> U 










XO 3 


Tyr 


orXU 


Cys 


Lys 


urXU 


Cys 


^jxy 














xo u 


Leu 


Asn 


oer 


XlXS 


1XL6 L. 


Arg 


Thr 




1 c n 
J.OU 










X 0 o 


Cys 


Lys 


VjX u. 


Cys 


VjXy 


Lys 


Cys 




1 "7 c 

± / o 










1 Q n 

X oU 


VjX u. 


XIX S 


VclX 


Arg 


Tl 
XX6 


XlXS 


Thr 




ion 










1 Q R 

xy o 


on II 

QirXU 


Cys 


vjxy 


Arg 


TV 1 ^ 

Axa 


13 v»^ 

pne 


Axa 




9 n c: 










Z X u 


T7sa "1 

vax 


Arg 


Tl 

X xe 


rixs 


Thr 


r^l ir 

vjxy 


VjXU 




o o n 
Z z U 










0 0 R 
z Z w> 


tuXy 


Lys 


A 1 


Tyr


AoXX 


ilX y 


PVio 
irllts 




O R 










0 /I n 
Z 4i u 


Txir 


XXXS 


Thr 


ijX U 


Ol -1 -1 
VjXU 


Lys 


Pro 




Z 3 U 










9 R R 
Z O 3 


Ser 


pne 


Arg 


OSX 


oex 


osx 






o c c: 

Z OD 










97 n 
z / u 


- 

Tnr 


ijxy 


Tl Ci 

xxe 


Lys 


Pro 


m 

Tyr 


Lys 




o Q n 
Z o u 










9 Q R 
Z oD 


Thr 


vax 


OCX 


OcX 


Gqi" 

ot=xr 


Leu 


nx fa 




295 










n n 
J u u 


Glu 


Lys 


Pro 


Tyr 


r^i 1 1 

VjXU 


Cys 


Lys 




310 










^X3 


Ser 


Ser 


IjXIl 


Leu 


Tl » 
xxe 


r^l n 

\jXU 


XlXS 




325 










^5 n 
0 0 u 


Pro 


Tyr 


Tl 

xxe 


Cys 


Lys 


VjXU 


Cys 














345 


His 


Leu 


Gln


Lys 


His 


Val 


Arg 




355 










360 


Ile


Cys 


Asn 


Glu 


Cys 


Gly 


Lys 




370 










375 


Thr 


Lys 


His 


Leu 


Lys 


Thr 


His 




385 










*3 Q n 
0 y u 


Glu 


Ile


Ser 


Glu 


Glu 


Lys 


Lys 




10 










15 


Leu 


Gln


Arg


Gln


Ile


Thr


Gln



25 30 
Glu


Cys 


Glu 


Leu 


Val 


m n 

VjX Li 


Thr 


o cx 


Asn 


Ser 


Glu 


Asp 




Leu 


Leu 




















40 










45 


Lys


His 


J. X p 


Val 


Ser


Pro 


Leu 


Lys 


Asp 


Ala 


Met 


Arg 


His 


Leu 


Pro 










50 










55 










60 


Ser


Gln


Glu


Ser


Gly


Ile


fxx y 


Glu 


Met 


His 


Ile


Ile


Pro


Gln


Lys 










yj 'J 










70 










75 


Ala 


Ile


Val


Gly


Glu 


Tie* 

X X V- 


VJX Jf 


His 


Gly 


Cys 


Asn 


Glu 


Gly 


Glu 


Lys 




















85 










QD 

\j 


Ile


Leu 


J. 


Ala 


v3 J. Jf 


vjX LL 


OCX 


OCX 


His 


jC-ij- y 




Glu 


Val 


Ser 


Gly 










95 










100 










105 


Gln


Asn 


Phe 


Lys 


Gln


XI Jf o 


Ser 


Ol V 
vjx _y 


Leu 


Thr 


Glu 


His 


Gln


Lys


Ile










1 1 n 

_L J- W 










X X 










19 0 
X z< w 


XX u. o 




J- c 




T AT'C* 


T'Trr* 

X XlX 


xyx 


ox LL 


V— 'jf to 


XJ Jf to 


V7X LL 


Cys 


m 11 

wX LL 


XI Jf to 


X XXX 










J. ^ -J 










X ^ \J 










1 R 

X J 


Phe 


Asn 


"J- y 


Ser 




X^OIX 


Leu 


Ile


Ile


His


Gln


jtxx y 


Ile


His 


Thr 










1 AO 

JL ^± L/ 










X ^ J 










1 RD 

X 3 u 




7\ cm 




ir j_ 




V CL X 


Jf to 


XT.toli 


^jjX U. 


to 


X _y 


XJ_y to 




OCX 


A CTI 
X-^tolX 










J — J -J 










160 










1 fiR 

X O J 


Gln


Ser


Ser


Asn


Leu


Ile


Ile


His


Gln


Arg


Ile


His 


Thr 


Gly 


Lys 










J. / L/ 










X / _/ 










1 ftn 

X O L/ 


Tat* CI 


jr J- (J 


j.yx 


J- J. C2 


^ Jf O 


Hi c; 
jnx o 


Glu 


V-fJf to 


m V 


Xi Jf to 




Phe 


A csn 


Gln


OCX 










1 ft R 

_L O J 










X J7 VJ 










1 QR 

X 7 J 




Asn 


Leu 


Val 


jt-ix. y 


His 


Lys 


Gln


Ile


His 


Ser 


Gly 


Gly 


Asn 


Pro 










W W 










9 ns 










91 n 

Zj X w 




Glu 


Jf o 


Lys 


Glu 




Gly
J- jf 


Lys 


Ala 


Phe 


Lys 


Gly

VJX Jf 


Ser 


Ser 


Asn 










91 R 

4C> X «J 










99 0 

/L, /U \J 










9 9 R 

Zi ^ 




V dJL 


Leu 


XXJL O 


Gln


jtix y 


Ile


JTLX to 


OCX 


■"■X y 


f^l V 
v?X Jf 


Lys 


irx w 


xyx 


XJC LL 










9'^ 0 
^ •J \j 










9'^ R 










9dn 

^ ^ L/ 


Cys 


Asn 


Lys 


Cys 


Gly 


Lys 


Ala 


Phe 


Ser 


Gln


Ser 


Thr 


Asp 


Leu 


Ile










245 










250 










255 


Ile


His 


His 




Ile


His 


Thr 


m V 


Glu 


Lys 


Pro 


xyx 


Glu 


Cys 


-lyx 










260 










265










9 7 0 


Asp 


Cys 


Gly 


Gln


Met 


Phe 


Ser 


Gln


Ser 


Ser 


His 


Leu 


Val 


Pro 


His 










97 S 










9 ft 0 

^ O VJ 










9 ft R 


Gln


j£-s.x. y 


Ile


Hi =5 
XJ.X o 




vjx_y 


Glu 


xj_y to 


JTX W 


Leu 


xi_y to 


v_.__y to 




Glu 


to 










9Qn 










9 Q R 










O L/ Lf 


w -1- U 




Ala 


xrxxc? 


xox y 


Gln


His 


OCX 


Hies 


Leu 


Thr 


Glu 


Hi C5 

XxX to 


\7XXX 


AX y 










O U w} 










O X L/ 










"^1 R 

O X 3 


J_}C: LL 




tDcX. 


var J,y 


VjT X LI 


xi_y to 


irx Kj 


lyr 


(X\ n 

virx LL 


v^_y to 


Hi cj 
jnx to 


ax y 


Pi res 
V— y to 


m V 

VjXjf 


Lys 










^9 0 

O ^ L* 










"^9 S 

o ^ «J 










O O L/ 




XTIXCS 




vir jL_y 


Arg


X XXX 


Ala 


Phe 


Leu 


Lys 


Hi ca 

XXX to 


Gln




Leu 


Hi Q 
XxX to 










o o o 










A n 

J f± u 










/I R 

J 41: 3 




Vjiy 


vj JL U. 


T.I res 


XX fci 


LtX U. 


vjX LL 




VjX LI 


Lys 


J-Xlx 


T>ViO 

irXxc 


OCX 


Lys 


i^top 










*3 n 

J 3 U 










R R 
.J} 3 3 










"5 n 
o o u 


JL LL 


vjx u. 




Airy 


VaX LL 


vrrX LL 


V7XXX 


Arg 


X xc 


HH c3 

jnx to 


fil n 

V7X1X 


VirX LL 


VjX LL 


XJjf to 


AT « 










O O 










J / u 










"^7 R 

O / 3 


xyi: 


J. JL^ 




Asn 


o^xxx 


V^Jy to 


vrr X _y 


/-ix y 




XrXXC 


virXXl 


VjX^ 


X XXX 


OCX 


Asp 










ft n 

o o u 










ft R 
O O J 










-J _? u 


Leu


Ile


"J- y 


His 


Gln


Val 


Thr 


His 


Thr 


Gly 


Glu 


Lys 


Pro 


X y X 


Glu 










O _7 J 










*± \J L/ 










A n R 

^ VJ 3 




xj_y o 


Glu 




Gly 


Lys 


Thr 


Phe 


Asn 


Gln


Ser 


OCX 




Leu 


Leu 










*± X w 










4.1 R 

f± X J 










A9 n 

f± LI 




Hi ea 
XI. J. o 


His 




Ile


His 


Ser 


Gly 


Glu 


Lys 


JTX L^ 


\^ Jf to 


V Ct X 


Cys 


Cloy 

OCX 










A9 R 




















A'^ R 

*i J .J 






VjJ-jr 


- 

Lys 


OCX 


p"h<=» 

irxxc 


rxxy 


CX\ V 


OCX 


OCX 




XJCIX 


X X c 


Jr\x, y 


Hn c? 

XXX to 




















y: w> 










ARO 

*± 3 L/ 


His 


Arg 


Val 


His 


Thr


Gly


Glu


Lys


Pro


Tyr


Glu 


Cys 


Ser 


Glu 


Cys 










455 










460 










465 


Gly 


Lys 


Ala 


Phe 


Ser 


Gln


Arg 


Ser 


His 


Leu 


Val 


Thr 


His 


Gln


Lys 










470 










475 










480 


Ile


His 


Thr 


Gly 


Glu 


Lys 


Pro 


Tyr 


Gln


Cys 


Thr 


Glu 


Cys 


Gly 


Lys 










485 










490 










495 


Ala 


Phe 


Arg 


Arg 


Arg 


Ser 


Leu 


Leu 


Ile


Gln


His 


Arg 


Arg 


Ile


His 



500










505










510




\3±y 




T ,\TCi 




iyr 


m n 

virx Li 


P\7'C! 




Pi n 

ox Lt 




Gly Lys 


Leu 


Phe 










J X D 










R9 n 










525 


J. -L C 




7\ 




Ala 


Xrllcs 


XI LJ. 


Lys 


His 


Gln


Ser 


XJC LL 


XXX 0 


Thr 


Gly 










^ J \j 










ij ^ ^ 










RAD 




Lys 




vj JL m 


P\r ci 


ml n 
VjXU 




± XXX 


Pin fa 


OCX 


Pi n 

VTXXX 




Glu 


Glu 


Leu 










R A R 

3 ^i: ^ 










R R n 

^ -J L/ 










RR R 
J J J 






P1 11 


Gln




X X t: 


XiX 0 


C^l n 
vjxxi 


Pi n 


Ala 


XI 0 


Ala 


±yj- 


X XJJ 


Cys 










J o u 










R R 
J 0 -J 










R7n 




nil n 




VJX J/' 


Arg 


iT.X ct 


Jriits 


Pi -n 


Pi V 


OCX 


OCX 




Leu 


Ile


Ar*rr 
y 










J / -J 










R R n 

u 0 \J 










RRR 
J 0 J 






V cxX 


JL XIX 


TT-i CI 
XXX o 


± lix 


Arg 


Pi n 

V^X LL 


T 

Lys 


P"r*r^ 
Jrx *J 




Pi n 
VjX LL 


Pt7"C! 

V— y fc> 


Tat'ci 
j-iy 0 


Pi TJ 
OX LL 










O J u 










R Q R 
O 27 3 










D UU 




m 17- 
o^-Ly 


Lys 


Thr 






V7XXX 


OCX 


OCX 


As^ 


Leu 


Leu 


Arg 


XXX 0 


XXX 0 










OR 
O UD 










D X U 










0 xo 






His 


Ser 


PT -IT- 

vjxy 


PI n 


T 

Lys 


Pro 


Tyr 


V ax 


v^ys 


Asn 


Lys 


Cys 


Pi TT- 










o z u 










<^ 9 R 

0 z 0 










D J U 


T 

Lys 




Phe 


Arg 


Pi -VT- 
VjX_^ 


OCX 


Q V 




xjcu. 


Tl <=» 
X X c 


ijy 0 


JtlXS 


TTt cs 

JtXXS 


«x y 


Tl (= 
X X c 










O O 3 










AO 

0 ^ 










AR 


JtlXS 




Gly Glu 


Lys 


Pro 


Tyr 


Pi n 
oXU 


Ptro 

v^ys 


OCX 


Pi n 


^ys 


Pi \r 


Lys 


23^1 55 
x-iXo. 










oo u 










R R 
0 3 0 










D 0 U 






Gln


Arg 


Oc^X 


XlXo 


Leu 


Zil a 


Thr 


His 


Pi -n 

V7XXX 


Lys 


Ile


XXX 0 


X XXX 










ODD 










670 










^^7 R 




VjJ.ll 


Lys 


Pro 


Tyr 


Pi -n 


Cys 




Glu 


Cys 


Pi -^r 

vjxy 


Asn 


Ala 


±r lie; 


Arg 










^ Q rv 

D O U 










685 










Q n 

D U 


Arg




Ser 


Leu 


Leu 


Tl *=» 

X xe 


Pi n 


PT-i c: 

nx to 


Arg 


Arg 


Xic LL 


His 


Ser 


Pi -^7 


Pi n 
vtX LL 










^ Q R 
D D 










700 










7 n R 




Pro


Tyr 


Glu 


V— ^ to 


Lys 


Pi 

VjX u. 




Gly Lys 


XIC LL 


Phe 


Met 


xxp 


XXXio 










/ X u 










715 










790 


JL XIX 


AX a. 


Phe 


Leu 


Lys 


XIX 0 


(Til n 

vsXXX 


Arg 


Leu 


His 


Al ^ 
aXCL 


Gly Glu 




Leu 










■79 c 










730 










7*^ R 


ril n 

U. 




Cys 


Glu 


J-j_Y 0 


X XXX 


It XXtz: 


OCX 


Lys 


Asp 


Pi n 

J- LL 


Glu 


Leu 


raX y 


T i\7^ci 
XJ_y sd 










7 A n 










745 










7 R n 


^•1 n 
J_ U. 


k^xri 


Arg 


Thr 


XIX 0 


Pi n 
v^xxx 


Pi n 

V7X Ll 


T 

Lys 


Lys 


Val 


Tyr


Trp 


Cys 


Asn 


PI -n 










"7 R R 










760 










7 <^ R 


Cys 


Ser 


Arg 


Thr 


Phe 


Gln


Gly 


Ser 


Ser 


Asp 


Leu 


Ile


Arg


His


Gln










770 










775 










780 


Val 


Thr 


His 


Thr 


Arg 


Glu 


Lys 


Pro 


Tyr 


Glu 


Cys 


Lys 


Glu 


Cys 


Gly 










785 










790 










795 


Lys 


Thr 


Gln


Ser 


Glu 


Leu 


Arg 


Pro 


Ser 


Glu 


Thr 


Ser 
















800 










805 













<210> 18 

<211> 290 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 8181605CD1 

<400> 18 



Met 


Gly Glu 


Leu 


Ser 


Pro 


Ala 


Val 


Ala 


Gln


Glu 


Glu 


Thr 


Pro 


Pro 


1 








5 










10 










15 


Gly Asp 


Trp


Leu 


Phe 


Gly 


Gly 


Val 


Arg 


Trp 


Gly 


Trp 


Asn 


Phe 


Arg 










20 










25 










30 


Cys 


Lys 


Pro 


Pro 


Val 


Gly 


Leu 


Asn 


Pro 


Arg 


Thr 


Gly 


Pro 


Glu 


Gly 










35 










40 










45 


Leu 


Pro 


Tyr 


Ser 


Ser 


Pro 


Asp 


Asn 


Gly 


Glu 


Ala 


Ile


Leu 


Asp 


Pro 










50 










55 










60 


Ser 


Gln


Ala 


Pro 


Arg 


Pro 


Phe 


Asn 


Glu 


Pro 


Cys 


Lys 


Tyr 


Pro 


Gly 










65 










70 










75 


Arg 


Thr 


Lys 


Gly 


Phe 


Gly 


His 


Lys 


Pro 


Gly 


Leu 


Lys 


Lys 


His 


Pro 
80


Ala 


Ala 


Pro 


Pro 


Gly 


Lys 


Ser 


Phe 


Gln


Leu 
1 1 n 

-U J- \J 


Cys 


Gly 


Ala 


Pro 


Asp 

125 


Ser 


Gly 


Ser 


Gly 


Gly 
140 


Arg 


Asp 


Gly 


Ser 


Ala 

_L J ^ 


Thr 


Arg 


Pro 


Ala 


His 
170 

JL / VJ 


Glu 


Arg 


Pro 


Phe 


Pro 

JL O J 


Arg 


Ser 


Lys 


Leu 


Ile


Pro 


Phe 


Thr 


Cys 


Thr 

91 S 

^ JL 


His 


Leu 


Arg 


Lys 


His 

9"^ n 
^ o u 


ir J- yJ 




Arg 


Gly 




Pro 


Phe 


Lys 


Ser 


Pro 
260 


Leu 


Val 


Thr 


Asp 


Trp 
275 


Asp 


Gly 


Gly Asp 


Met 










290 



<210> 19 

<211> 452 
<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature
<223> Incyte ID No: 8266487CD1

<400> 19 



Met 


Lys 


Gly 


His 


Glu 


1 








5 


Ala 


Glu 


Arg 


Phe 


Pro 
20 


Ser 


His 


Phe 


Glu 


Pro 
35 


Cys 


Glu 


Lys 


Thr 


Phe 

50 


Arg 


Ala 


His 


Phe 


Arg 
65 


Gly 


Cys 


Ser 


Lys 


Gln
80 


Leu 


Arg 


Ser 


His 


Thr 
95 


Ser 


Cys 


Gly 


Trp 


Thr 
110 


Arg 


Arg 


Lys 


His 


Asp 
125 


Gly 


Cys 


Gly 


Lys 


Ser 
140 


Ser 


Ile


Thr 


His 


Leu 
155 


Gly 


Cys 


Cys 


Ala 


Arg 











O J 


Gly 


Arg 


Pro 


Phe 


Thr 
100 


Gln


Val 


Ser 


Leu 


Ser 
115 


Gly 


Ser 


Gly 


Pro 


Gly 

1 n 


Gly 


Gly 


Gly 


Gly 


Ser 
145 


Leu 


Arg 


Cys 


Gly 


Glu 
1 fin 


Leu 


Ile


Arg 


His 


Arg 

1 *7R 
J. / 3 


Cys 


Thr 


Glu 


Cys 


Glu 
1 Qn 


Asp 


His 


Tyr 


Arg 


Thr 

9 OR 


Val 


Cys 


Gly 


Lys 


Ser 

9 9 0 


Gln


Arg 


Asn 


His 


Ala 

9 "5 R 


Pro 


Leu 


Pro 


Thr 


Pro 

250 


Ala 


Ser 


Lys 


Gly 


Pro 
265 


Thr 


Cys 


Gly 


Leu 


Ser 
280 



8266487CD1 



Gln


Glu 


Ser 


Leu 


Phe 










10 


Thr 


His 


Ala 


Lys 


Leu 










25 


Glu 


Arg 


Pro 


Tyr 


Lys 










40 


Ile


Thr 


Val 


Ser 


Ala 










55 


Glu 


Gln


Glu 


Leu 


Phe 










70 


Tyr 


Asp 


Lys 


Ala 


Cys 
85 


Gly 


Glu 


Arg 


Pro 


Phe 










100 


Phe 


Thr 


Ser 


Met 


Ser 










115 


Asp 


Asp 


Arg 


Arg 


Phe 










130 


Phe 


Thr 


Arg 


Ala 


Glu 










145 


Gly 


Thr 


Lys 


Pro 


Phe 










160 


Phe 


Ser 


Ala 


Arg 


Ser 











90 


Cys 


Ala 


Thr 


Cys 


Gly 
105 


Ala 


His 


Gln


Arg 


Ser 
120 


Thr 


Gly 


Gly 


Gly 


Gly 
135 


Gly 


Gly 


Gly 


Ser 


Ala 
150 


Cys 


Gly 


Arg 


Cys 


Phe 
165 


Met 


Leu 


His 


Thr 


Gly 
1 fin 


Lys 


Arg 


Phe 


Thr 


Glu 
1 

_L -7 «J 


His 


Thr 


Gly 


Val 


Arg 

91 n 
^ J- \j 


Phe 


Ile


Arg 


Lys 


Asp 
99 R 

zli Zj ~J 


Ala 


Gly 


Ala 


Lys 


Thr 
94n 

^ u 


Pro 


Ala 


Pro 


Pro 


Asp 

255 


Leu 


Ala 


Ser 


Thr 


Asp 
270 


Val 


Leu 


Gly 


Pro 


Thr 
285 



Lys 


Cys 


Glu 


Val 


Cys 
15 


Ser 


Ser 


His 


Gln


Arg 
30 


Cys 


Asp 


Phe 

r 


Pro 


Gly 
45 


Leu 


Phe 


Ser 


His 


Asn 
60 


Ser 


Cys 


Ser 


Phe 


Pro 
75 


Arg 


Leu 


Lys 


Ile


His 
90 


Ile


Cys 


Asp 


Ser 


Asp 
105 


Lys 


Leu 


Leu 


Arg 


His 
120 


Thr 


Cys 


Pro 


Val 


Glu 
135 


His 


Leu 


Lys 


Gly 


His 
150 


Glu 


Cys 


Pro 


Val 


Glu 
165 


Ser 


Leu 


Tyr 


Ile


His 
170


Ser 


Lys 


Lys 


His 


Val 

J- O -J 


Pro 


Val 


Ser 


Thr 


Cys 
200 


Lys 


Ala 


His 


Met 


Val 


Pro 


Gln


Leu 


Glu 


Ala 
230


Ser 


Ser 


Pro 


Gly 


Gln

9 A S 


Leu 


Phe 


Ser 


Asp 


Thr 
9 fin 


Ser 


Asp 


Glu 


Ala 


Leu 


Ser 


Val 


Ser 


Ser 


Ser 
9 Q n 

Z J? u 


Ser 


Leu 


Gly 


Pro 


Met 


Ile


Pro 


Pro 


Ser 


Leu 


Thr 


Val 


Leu 


Gln


Gln

R 

3 3 D 


Val 


Ser 


Ala 


Gly 


Ala 
R n 

0 0 u 


Asn 


Leu 


Ser 


Asp 


Asp 

fi R 
J D J 


Ala 


Ala 


His 


Ile


Thr 
0 0 u 


Asn 


Ala 


Ser 


Val 


Pro 

0 ^ -J 


Asp 


Ser


Pro 


Ser 


Arg
410 


His 


Gly 


Leu 


Pro 


Gln

425 


Gly 


Ala 


Gln


Asp 


Thr 
440 


Leu 


Val 









<210> 20 
<211> 259 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 
<223> Incyte ID No: 5552784CD1

<400> 20 



Met 


Ala 


Ser 


Pro 


Gln


1 








5 


Arg 


Asn 


Gln


Leu


Gln










20 


Glu 


Glu 


Glu 


Val 


Arg 










35 


Val 


Cys 


Cys 


Leu 


Cys 










50 


Thr 


Glu 


Cys 


Leu 


His 










65 


Leu 


Gln


Thr 


Ser 


Lys 










80 


Glu 


Thr 


Gln


Pro 


Leu 











1 7R 


Gln


Asp 


Val 


Gly 


Ala 
1 QD 


Asn 


Arg 


Leu 


Phe 


Thr 
9nR 


Arg 


Gln


His 


Ser 


Arg 

9 9 0 


Pro 


Ser 


Ser 


Leu 


Thr 

9*^ R 

3 ^ 




Vj7 J. U, 


Leu 


Thr 


9 R 0 


Pro 


Ala 


Asn 


Ala 


Ser 

Z 0 D 


Asn 


Ser 


Gly 


Ile


Leu 

0 Q n 
z 0 u 


Leu 


Gly 


Gly Asn 


Leu 










9 Q R 

Z J? 3 


Glu 


Pro 


Leu 


Val 


Leu 
0 ± u 


Asp 


Ser 


Pro 


Leu 


Val 

"^9 R 
0 Z O 


Gly 


Ser 


Phe 


Ser 


Val 

"5 /I n 

J 4t u 


Leu 


Gly 


Cys 


Leu 


Val 

•5 R R 
ODD 


Pro 


Leu 


Ala 


Leu 


Thr 


Thr 


Pro 


Thr 


Ser 


Ser 

RR 


Glu 


Leu 


Leu 


Ala 


Pro 
/inn 


Pro 


Gly 


Ala 


Val 


Gly 
415 


Ser 


Thr 


Leu 


Pro 


Ser 
430 


Glu 


Leu 


Ser 


Ala 


Gly 
445 



5552784CD1 



Gly 


Gly 


Gln


Ile


Ala 
10 


Ser 


Val 


Tyr 


Lys 


Met 
25 


Val 


Lys 


Ile


Lys 


Asp 
40 


Ala 


Gly 


Tyr 


Phe 


Val 
55 


Thr 


Phe 


Cys 


Lys 


Ser 
70 


Tyr 


Cys 


Pro 


Met 


Cys 
85 


Leu 


Asn 


Leu 


Lys 


Leu 






180 





Lys 


OCJL 


7\ -v-rT 












1 QR 

X -7 wJ 


OCfJL 


ij_y to 




OCX 












210 


Arg 


Cll n 


_ 

ASp 


T 


Leu 










225


ir X 


OfcJX 


0 v5X 


ox u. 


Xj ts u. 










940 

^ ~b W 




Asp 


T 

Leu 


-rt.X ct 


-rt.X ci 










9 RR 
z -J J 


vjJLy 


Ser 


AX a 


-vr 
VnXy 


vjjxy 










97 0 


Tiir 


Ile


Asp 














9 RR 


Pro 


Ala 


Asn 


Asn 


oSx 










n n 
J u u 


V aj. 


Ala 


His 


oer 












0 X D 


T 

Leu 


Gly Thr 


jBl.XcI 


AX a. 










"5 T n 


Asp 


Asp 


Val 


vjxn 


Thr 












A J- a 


Leu 


Pro 


iYLeu. 


Lys 










fi 0 
0 0 u 


oex 


Asn 


Ser 




Xieu 










"^7 R 

J / D 


Ser 


Thr 


Pro 


Arg 


VjX Li 










J ^ \j 


Ile


Lys 


Val 


Glu 


Pro 










405 


Gln


Gln


Glu 


Gly 


Ser 










420 


Pro 


Ala 


Glu 


Gln


His 










435 


Thr 


Gly Asn 


Phe 


Tyr 










450 


Ile


Ala 


Met 


Arg 


Leu 










1 R 

X >J 


Asp 


Pro 


Leu 


Arg 


Asn 










0 


Leu 


Asn 


Glu 


jlxS 


xxe 










45 


Asp 


Ala 


Thr 


Thr 


Ile










60 


Cys 


Ile


Val 


Lys 


Tyr 










75 


Asn 


Ile


Lys


Ile


His 










90 


Asp 


Arg 


Val 


Met 


Gln
-J


Asp 


Ile


Val 


Tyr 


Lys 

J — L W 


Lys 


Arg 


Ile


Arg 


Glu 
125 


Thr 


Gln


Pro 


Thr 


Gly 

1 do 

_L y: VJ 


Pro 


Phe 


Ser 


Ser 


Phe 
155 


Asp 


Glu 


Gln


Leu 


Asn 

170


Asp 


Lys 


Asn 


Lys 


Ser 

J. o ^ 


Val 


Arg 


Ala 


Glu 


Val 


Leu


Met 


Leu 


Asn 


Pro 
215 


Val 


Leu 


Pro 


Asp 


His 
230 


Trp 


Phe 


Gly 


Lys 


Pro 
245 


Glu 


Lys 


Arg 


Arg 





<210> 21 
<211> 665 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature
<223> Incyte ID No: 7281230CD1

<400> 21 



Met 


Ala 


Ala 


Gln


Met 


1 








5 


Phe 


Pro 


Ser 


Pro 


Leu 
20 


Ser 


Pro 


Gly 


Gln


Cys 
35 


Gly 


Pro 


Arg 


Gly 


Ala 
50 


Trp 


Leu 


Met 


Pro 


Glu 
65 


Leu 


Val 


Leu 


Glu 


Gln
80 


Ala 


Tyr 


Thr 


Gln


Glu 

95 


Ala 


Leu 


Ala 


Glu 


Arg 
110 


Gln


Met 


Ser 


Gly 


Gly 
125 


Pro 


Gln


Glu 


Glu 


Leu 
140 


Glu 


Ala 


Pro 


Leu 


Gly 
155 


Arg 


Glu 


Met 


Glu 


Ser 
170 


Glu 


Glu 


Gly 


Gln


Val 
185 


Leu 


Ser 


Glu 


Gly 


Ala 
200 


Ser 


Thr 


Glu 


Val 


Pro 











100 


Leu 


Val 


Pro 


Gly Leu 










115 


Phe 


Tyr 


Gln


Ser 


Arg 
130 


Glu 


Glu 


Pro 


Ala 


Leu 

145 


Asp 


His 


Ser 


Lys 


Ala 
1 fiO 

-LOU 


Leu 


Cys 


Leu 


Glu 


Arg 

1 7^^ 


Val 


Leu 


Gln


Asn 


Lys 
1 on 


Arg 


His 


Leu 


Arg 


Arg 
205 


Gln


His


Val


Gln


Leu 
220 


Met 


Thr 


Met 


Lys 


Gln
235 


Ser 


Pro 


Leu 


Leu 


Leu 
250 



7281230CD1



Ser 


Glu 


Ala 


Ser 


Ala 
10 


Glu 


Leu 


Met 


Val 


Gly 
25 


Phe 


Trp 


Gly 


Phe 


Cys 
40 


Leu 


Ala 


Gln


Leu 


Arg 
55 


Ala 


Cys 


Ser 


Lys 


Glu 
70 


Leu 


Leu 


Gly 


Thr 


Leu 
85 


Gln


Trp 


Leu 


Gly 


Ser 
100 


Leu 


Gln


Gln


Glu 


Ser 
115 


Trp 


Ser 


Gly 


Gly 


Trp 
130 


Val 


Pro 


Arg 


Thr 


Glu 
145 


Pro 


Phe 


Gln


Ala 


Pro 
160 


Pro 


Arg 


Gly 


Trp 


Thr 
175 


Leu 


Cys 


Asn 


Val 


Lys 
190 


Val 


Ser 


Gly 


Gly Trp 










205 


Arg 


Glu 


Ala 


Gly Asp 














1 OR 

X U J 


Gln


Asp 


Ser 


Glu 


Glu 










120 


Gly 


Leu 


Asp 




Val 










1 "^R 

X >,J ^ 


Ser 


Asn 


Leu 


Gly


Leu 










150 


Hi c; 

XJ._L »D 


±yx 




-t\j-. y 


j.yx 










X VJ «J 


Leu 




O ti-L 


vj_L_y 


J— l_Y O 










1 RO 

X O L/ 




V O. -L 




V-» Jf o 


Ser 










X J7 >J 


V Ct JL 




Jf S3 


His 


ax y 










210 


Leu 


Phe 


Asp 


Asn 


Glu 










225 


Ile


Trp 


Leu 


Ser 


Arg 










240 


Gln


Tyr 


Ser 


Val 


Lys 










255 


Jut; u. 


Z^-L CJ. 


Pro 


Gln


Val 










1 R 

X J 


VjJLU 


Pro 


Ser 


Ser 


Lys 












Tyr 


IjiU 


Lys 


/\-La. 


Til i5 

AX a 










AR 


V:r±U 


Leu 


Cys 


Cys 


r^T -r-v 

tjj.n 










0 






Leu 


VT-L U. 


Leu 










7 n; 


Leu 


Pro 


vjjX U. 




VjrXXl 










Q n 
y u 


Pro 




LI 




± llx 










1 OR 

X U -J 






Pro 


Gly 


Leu 










1 90 


T/a 1 
Vci-L 


Pro 


Ala 


Pro 


Arg 










X J D 


vjj_L U. 


vj-Ly 


Glu 


Glu 


urXlX 












Pro 


Pro 


Gly His 


Arg 










J- O D 


Leu 


Gln


Val 


Ala 


Pro 










180 


Thr 


Ala 


Thr 


Arg 


Gly 










195 


Gly 


Ala 
Ajtq"


Cys 




Tyr 


V ax 
125 


vax 


vaxy 


Arg 


r^i It 

VjXU 


Arg 
130 


Thr 


r*i 11 

VaXU 


Leu 


a 1 a 


Arg 
135 




1j611 


Asn


Leu 


Oc3X 

140 


VjXXl 


rn"U, 


m n 

IjXIl 


\7a 1 


Lys 
145 


T7a n 


Trp 


jrxxe 


vjixn 


Asn 
150 


Arg


Arg




Lys 


VjrXn 

155 


Lys 


Lys 


Asp 


vjxn 


vuxy 
160 


Lys 


Asp 


Ser 


r*i 11 

oXU 


Leu 

165 


Arg 




va± 


v ax 


oer 
170 


VjXU 


Thr 


i\xa 


Axa 


Thr 
175 


Cys 


oer 


vax 


Leu 


Arg 
180 




Leu


\j_L U. 


VjXii 


vjxy 

IRS 
X 0 3 


Arg 


Leu 


Leu 


Ser 


Pro 
190 


Pro 


oxy 


Leu 


Pro 


a 1 a 

1 QS 




T 


Jr X. \J 


±r J_ 


200 




J. XIX 




Ala 


Leu 
205 




0C7X 


ai 


Leu 


Arg 
210 




PUTO 




T .01 1 
J-J ti U. 


Jr X. *J 

215 


AT 


Xltr U. 


vjxy 


Ala 


Gly 
220 


al a 


ai a 


ai a 


nl 

ijxy 


oex 
225 


an a 






AX a 


/iXo. 

230 


an a 

aXo. 


nxa 


a 1 a 
/\xa 


Pro 


Gly 
235 


Pro 


ai a 

r\X cl 


Pi -xr 

wxy 


a 1 a 

AX a 


ai a 
AX a 

240 


OSi 


±rX. \J 


■pT-i cs 


Pro 


0 T- 


jTiXCt 


Va.X 


vjxy 


Gly Ala 


IT X 


vjxy 


JrXO 


Pi \r 
vrXy 


Pro 










245 










250 










255 


i-ixo. 


vj±y 


PJTO 


vrxy 


m -vr 

^jxy 
260 


Leu 


xlXS 


ai a 
Axa 


Cys 


Ala 
265 


Pro 


ai a 
/1.x a. 


ai a 

riXa 


r^l -vr 

vjxy 


XlXS 

270 




Leu 


jrne 


Ser 


Leu 

275 


Pro 


vax 


Pro 


Ser 


Leu 

280 


Leu 


vjxy 


Ser 


vax 


AX a 

285 


Ser 


Arg 


Leu 


Ser 


Ser 
290 


Ala 


Pro 


Leu 


Thr 


Met 
295 


Ala 


Gly 


Ser 


Leu 


Ala 
300 


Gly 


Asn 


Leu 


Gln


Glu 
305 


Leu 


Ser 


Ala 


Arg 


Tyr 
310 


Leu 


Ser 


Ser 


Ser 


Ala 
315 


Phe 


Glu 


Pro 


Tyr 


Ser 
320 


Arg 


Thr 


Asn 


Asn 


Lys 
325 


Glu 


Gly 


Ala 


Glu 


Lys 
330 



Lys Ala Leu Asp 



<210> 26 

<211> 262 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 3356640CD1 



<400> 26 



Met 


Lys 


Arg 


His 


Glu 


Met 


Val 


Val 


Ala 


Lys 


His 


Ser 


Ala 


Leu 


Cys 


1 








5 










10 










15 


Ser 


Arg 


Phe 


Ala 


Gln


Asp 


Leu 


Trp 


Leu 


Glu 


Gln


Asn 


Ile


Lys 


Asp 










20 










25 










30 


Ser 


Phe 


Gln


Lys 


Val 


Thr 


Leu 


Ser 


Arg 


Tyr 


Gly 


Lys 


Tyr 


Gly 


His 






WO 03/006618



PCT/US02/21971 













Lys 


Asn 


Leu 


Gln


Leu 
o U 


Lys 


Glu 


His 


Gln


Gly 


Ile


Thr 


Thr 


Ser 


Lys 
o u 


Met 


His 


Lys 


Phe 


Ser 


Glu 


Asn 


Lys 


His 


Phe 


Met 


Leu 


Ser 


Arg 


Leu 


Asn 


Phe 


Tyr 


Lys 


Cys 


Thr 


Asn 


Leu 


Ser 


Lys 

JLDO 


Tyr 


Lys 


Cys 


Glu 


Val 

J, / U 


Leu 


Thr 


Lys 


His 


Lys 

1 Q C 


Cys 


Ala 


His 


Cys 


Gly 

o n rv 
Z U U 


Arg 


His 


Lys 


Ile


Ile
zl5 


Gln


Cys 


Gly Lys 


Val 










230 


Gln


Ile


Ile


Tyr 


Thr 
245 


Gly 




Ala 


Phe 


Asn 
260 



<210> 27 

<211> 509 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature
<223> Incyte ID No: 

<400> 27 



Met 


Ala 


Leu 


Ser 


Gln


1 








5 


Glu 


Phe 


Ser 


Gln


Glu 

20 


Thr 


Leu 


Tyr 


Arg 


Asp 
35 


Ser 


Leu 


Asp 


Ile


Ser 
50 


Thr 


Gly 


Gln


Gly 


Asn 
65 


Arg 


Gln


Ala 


Ser 


Tyr 
80 


Glu 


Lys 


Asp 


Ile


His 
95 


Thr 


Asn 


Asp 


His 


Glu 
110 


Ser 


Ser 


Thr 


Asp 


Arg 
125 


Ile


Lys 


Gly 


Gln


Leu 
140 


His 


Arg 


Arg 


Ile


His 











40 


Arg 


Lys 


Gly 


Cys 


Lys 
55 


Gly 


Tyr 


Asn 


Gly Leu 










/ U 


Ile


Phe 


Gln


Cys 


Asn 

OD 


Asn 


Ser 


Asn 


Arg 


His 

XL) U 


Arg 


Cys 


Lys 


Glu 


Cys 
115 


Thr 


Gln


His 


Lys 


Lys 


Glu 


Glu 


Cys 


Gly 


Lys 
145 


Pro 


Lys 


Lys 


Ile


His 

1 ^ A 

XoO 


Cys 


Gly 


Lys 


Ala 


Phe 

1/5 


Ile


Ile


Arg 


Thr 


Gly 
190 


Lys 


Ala 


Phe 


Lys 


Gln

205 


His 


Thr 


Glu 


Glu 


Lys 
220 


Phe 


Lys 


Gln


Ser 


Pro 
235 


Gly 


Glu 


Glu 


Pro 


Tyr 
250 


Leu 


Ser 









2015706CD1 



Gly 


Leu 


Leu 


Thr 


Phe 










10 


Glu 


Trp 


Lys 


Cys 


Leu 










25 


Val 


Met 


Leu 


Glu 


Asn 










40 


Ser 


Arg 


Cys 


Met 


Met 










55 


Thr 


Glu 


Val 


Ile


His 










70 


His 


Ile


Gly 


Ala 


Phe 










85 


Asp 


Phe 


Val 


Phe 


Gln










100 


Ala 


Pro 


Met 


Thr 


Glu 










115 


Tyr 


Asp 


Gln


Arg 


His 










130 


Glu 


Ser 


Arg 


Phe 


His 










145 


Thr 


Gly 


Glu 


Lys 


Pro 













Ser 


Val 


Asp 


Glu 


Cys 
o u 


Asn 


Gln


Cys 


Leu 


Lys 
1 o 


Lys 


Tyr 


Val 


Lys 


Val 
y u 


Lys 


Ile


Arg 


His 


Thr 


Asp 


Lys 


Ser 


Leu 


Cys 
x^ U 


Ile


His 


Thr 


Arg 


Glu 

XOO 


Thr 


Phe 


Asn 


Trp 


Ser 

XD U 


Thr 


Gly 


Glu 


Lys 


Pro 
Xbo 


His 


Gln


Ser 


Ser 


Ile

XoU 


Glu 


Lys 


Pro 


Tyr 


Lys 

1 Q R 

xyo 


Ser 


Ser 


His 


Leu 


Thr 

zlO 


Pro 


Tyr 


Lys 


Cys 


Glu 
225 


Thr 


Leu 


Thr 


Lys 


His 
240 


Lys 


Cys 


Glu 


Glu 


Cys 
255 



Arg Asp 


Val 


Ala 


Ile










15 


Asp 


Pro 


Ala 


Gln


Arg 
30 


Tyr 


Arg 


Asn 


Leu 


Val 
45 


Asn 


Thr 


Leu 


Ser 


Ser 
60 


Thr 


Gly 


Thr 


Leu 


Gln
75 


Cys 


Ser 


Gln


Glu 


Ile
90 


Trp 


Gln


Glu 


Asp 


Glu 
105 


Ile


Lys 


Lys 


Leu 


Thr 
120 


Ala 


Gly 


Asn 


Lys 


Pro 
135 


Leu 


His 


Leu 


Arg 


Arg 
150 


Tyr 


Lys 


Cys 


Glu 


Glu 



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1 n: c 
XOD, 


Cys 


JL U. 


Lys 


Val 


irne 

JL / U 


Ile


Ile


His 


Thr 


Gly 

J. oD 


Lys 


Ala 


Phe 


Lys 


His 

o n n 


His 


Arg 


Gly Asp 


Lys 












Phe 


Asp 


Gln


Lys 


Ala 
o "3 n 


Gly 


Glu 


Lys 


Pro 


Tyr 

O /I K 


Gln


Thr 


Ser 


His 


Leu 

o ^ n 


Lys 


Pro 


Tyr 


Lys 


Cys 

O '7 c 
Z /D 


Ser 


Val 


Leu 


Val 


Ile

O Q A 

ZyU 


Tyr 


Lys 


Cys 


Asn 


Glu 

o r\ c 


Leu 


Ala 


Gly His 


Arg 










TOO 

o Z U 


Cys 


Glu 


Glu 


Cys 


Asp 

'D C 

J5 J5 D 


Arg 


His 


Arg 


Arg 


Ile

*3 c r\ 


Val 


Cys 


Asp 


Lys 


Ala 

•3 R 
O O D 


Gln


Arg 


Val 


His 


Thr 
"3 Q n 


Gly 


Lys 


Val 


Phe 


Ser 
"3 Q 

o y 3 


Leu 


His 


Thr 


Gly 


Glu 

4I:XU 


Val 


Tyr 


Ile


Arg 


Lys 
/IOC 


Thr 


Gly 


Glu 


Lys 


Pro 

44 U 


Asn 


Ser 


Pro 


Ser 


His 
455 


Gln


Lys 


Ser 


Tyr 


Lys 
470 


Arg 


Ser 


Leu 


Leu 


Ala 
485 


Cys 


Phe 


Lys 


Cys 


Asn 
500 



<210> 28 
<211> 310 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature
<223> Incyte ID No: 

<400> 28 

Met Ser Gln Gln Leu
1 5 
Gly Leu Gly Gly Arg 
20 

Lys Ser Ser Gln Asp











± D u 


Ser 


Cys 


Lys 


Ser 


His 

1 "7 
J. / 3 


Glu 


Lys 


Pro 


Tyr 


Lys 

1 QD 


Asp 


Ser 


His 


Leu 


Ala 
0 n R 

Z U D 


His 


Tyr 


Thr 


Cys 


Asn 

ZAU 


Thr 


Leu 


Ala 


Cys 


His 

0 "3 R 
ZjD 


Lys 


Cys 


Asn 


Glu 


Cys 

o R n 
Zdu 


Val 


Tyr 


His 


His 


Arg 

O R 
ZOO 


Asn 


Glu 


Cys 


Gly 


Lys 

o Q n 
ZoU 


His 


Lys 


Ala 


Val 


His 

r> Q R 

zy D 


Cys 


Gly 


Lys 


Val 


Phe 

o JLU 


Arg 


Val 


His 


Thr 


Gly 

O R 

oZd 


Lys 


Val 


Phe 


Ser 


Arg 

Q /I n 
o 4 U 


His 


Thr 


Gly 


Glu 


Lys 

"3 R R 

J 3 


Phe 


Arg 


Ser 


Asp 


Ser 
T T n 


Gly 


Glu 


Arg 


Pro 


Tyr 

"3 P R 


Thr 


Lys 


Ala 


Tyr 


Leu 
/inn 


Lys 


Leu 


Tyr 


Glu 


Cys 

/I 1 R 


Ser 


His 


Leu 


Glu 


Arg 

Aid 
4to U 


His 


Lys 


Cys 


Gly 


Asp 

/ /I R 
44 3 


Leu 


Ile


Arg 


His 


Gln
460 


Cys 


His 


Gln


Cys 


Gly 
475 


Glu 


His 


Gln


Lys 


Ile
490 


Glu 


Tyr 


Ser 


Lys 


Pro 
505 



6920755CD1 



Lys Lys Arg Ala Lys 
10 

Ala Pro Ser Gly Ala 
25 

Leu Gln Ala Glu Ile











J_ O 3 


Leu 


Glu 


Ile


His 


Arg 

X o vJ 


Cys 


Lys 


Val 


Cys 


Asp 

JL J? 3 


Lys 


His 


Thr 


Arg 


Ile

91 n 

Z J. u 


Glu 


Cys 


Gly 


Lys 


Val 

9 9 R 
ZZD 


His 


Arg 


Ser 


His 


Thr 

9 An 
z ^ u 


Gly 


Lys 


Thr 


Phe 


Ser 

9 R R 
Z33 


Leu 


His 


Thr 


Gly 


Glu 

97n 
z / u 


Thr 


Phe 


Ala 


Arg 


Asn 

9 Q R 
Z O 3 


Thr 


Ala 


Glu 


Lys 


Pro 

^3 n n 
o u u 


Lys 


Gln


Arg 


Ala 


Thr 

•31 R 
J5 J.3 


Glu 


Lys 


Pro 


Tyr 


Arg 
"3 n 

J5 3 U 


Lys 


Ser 


His 


Leu 


Glu 

"3 /I R 
3 43 


Pro 


Tyr 


Lys 


Cys 


Lys 
"3 <^ n 

3 D U 


Arg 


Leu 


Ala 


Glu 


His 

-37c; 
3/3 


Thr 


Cys 


Asn 


Glu 


Cys 

"3 Q n 
3 y u 


Ala 


Cys 


His 


Gln


Lys 
An R 

4U 3 


Glu 


Glu 


Cys 


Asp 


Lys 
A9 n 


His 


Arg 


Arg 


Ile


His 

A "3 R 
43 3 


Cys 


Gly 


Lys 


Ala 


Phe 

A R n 

4t 3 U 


Arg 


± ±e 


TT A r-> 

JrllS 


inr 


r*! IT 

ij±y 
465 


Lys 


Val 


Phe 


Ser 


Leu 
480 


Pro 


Phe 


Gly 


Asp 


Asn 
495 


Ser 


Ser 


Ile


Asn 





Thr Arg His Gln Lys
15 

Lys Pro Arg Gln Gly
30 

Glu Pro Val Ser Ala 














O D 


Val 


Trp 


Ala 


Leu 


Cys 

D U 


Gln


Ala 


Leu 


Gly 


Gly 


Val 


Ile


Arg 


Gly 


Glu 

oU 


Leu 


Phe 


Glu 


Ser 


Leu 

Q c; 


Leu 


Ser 


Gln


Lys 


Val 
Tin 


Glu 


Tyr 


Met 


Lys 


Lys 


Val 


Gly 


Glu 


Asn 


Ser 


Lys 


Leu 


Pro 


Pro 


Gly 


Lys 


Gln


Leu 


Ala 


Glu 

± / U 


Glu 


Tyr 


Asp 


Ser 


Leu 

IOC 


Thr 


Arg 


Lys 


Leu 


Arg 

O A A 


Ile


His 


Gly 


Pro 


Arg 

ZlD 


Phe 


Val 


Glu 


Ser 


Ser 

OTA 

23 0 


Gly 


Glu 


Lys 


Pro 


Phe 

Z4LD 


Phe 


Ser 


Leu 


Asp 


Phe 

O /T A 


Gly 


Glu 


Lys 


Arg 


Phe 
275 


Phe 


Ile


Gln


Ser 


Asn 
290 


Asn 


Thr 


Asn 


Lys 


Asn 
305 



<210> 29 
<211> 402 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 
<223> Incyte ID No: 

<400> 29 



Met 


Ala 


Ala 


Val 


Ile


1 








5 


Phe 


Pro 


Ala 


Ser 


Gln










20 


Val 


Asn 


Glu 


Leu 


Leu 










35 


Glu 


Asp 


Val 


Ala 


Val 










50 


Asp 


Pro 


Ala 


Gin 


Arg 










65 


Cys 


Arg 


Asn 


Leu 


Ala 










80 


Leu 


He 


Ser 


Gin 


Leu 










95 


Arg 


Gly 


He 


Leu 


Pro 











40 


Asp 


Gly 


Tyr 


Val 


Cys 
55 


Asp 


Asp 


Phe 


Ser Asp 










/ 0 


Phe 


Ser 


Gin 


Pro 


He 

o c 


Glu 


Tyr 


Leu 


Lys 


ijys 

1 A O 

1 uU 


Phe 


Glu 


Ala 


Ser 


Ser 

tic 

115 


Gly 


Val 


Lys 


Lys 


Glu 

1 A 

130 


Leu 


Glu 


Tyr 


Ser 


Glu 

145 


Gly 


He 


Pro 


Gly 


He 
160 


Phe 


Ala 


Arg 


Lys 


Lys 
175 


Ser 


Ala 


He 


Ala 


Cys 
190 


Asp 


Arg 


Ala 


Ala 


Leu 

205 


Asp 


His 


Val 


Cys 


Ala 
220 


Lys 


Leu 


Lys 


Arg 


His 
235 


Arg 


Cys 


Thr 


Phe 


Glu 
250 


Asn 


Leu 


Arg 


Thr 


His 
265 


Val 


Cys 


Pro 


Phe 


Gin 
280 


Asn 


Leu 


Lys 


Ala 


His 
295 


Glu 


Gin 


Glu 


Gly Lys 










310 



444179CD1 



Leu 


Pro 


Ser 


Thr 


Ala 










10 


Gin 


Lys 


Gly 


His 


Thr 










25 


Thr 


Ser 


Trp 


Leu Arg 










40 


Glu 


Phe 


Thr 


Gin 


Glu 










55 


Thr 


Leu 


Tyr 


Arg 


Asp 










70 


Ser 


Leu 


Gly 


Cys 


Arg 










85 


Glu 


Gin 


Asp 


Lys 


Lys 










100 


Ser 


Thr 


Cys 


Pro 


Asp 














45 


Tyr 


(jXU 


Pro 


CjXy 


Pro 










C A 


Cys 


Tyr 


x±e 


/~<n 11 
WXU 


Cys 










/ b 


Leu 


CjxU 


(j±U 


Asp 


Ser 










A A 


LiXy 


Ser 


LjXU 


VjyXn 


LiXn 










1 A C 

XUd 


Leu 


CuJ-U 


Cys 


Ser 


Leu 










1 O A 

Xz U 


Leu 


Pro 


Gin 


Lys 


xxe 










135 


Tyr 


Met 


Thr 


G±Y 


Lys 










1 C A 

Xd u 


Asp 


T «i 1 

Leu 


Ser 


Asp 


Pro 










165 


Pro 


Pro 


xxe 


Asn 


Lys 










180 


Pro 


Gin 


Ser 


Gly 


Cys 










195 


Arg 


Lys 


rllS 


Leu 


Leu 










210 


Glu 


Cys 


Gly 


Lys 


Ala 










225 


Fne 


Leu 


V ax 


JrtlS 


Thr 










240 


Gly 


Cys 


Gly 


Lys 


Arg 










255 


Val 


Arg 


He 


His 


Thr 










270 


Gly 


Cys 


Asn 


Arg 


Arg 










285 


He 


Leu 


Thr 


His 


Ala 










300 


Ala 


Pro 


Ser 


Ser 


Leu 










15 


Gin 


Gly 


Gly 


Glu 


Leu 










O A 

3 0 


Gly Leu Val 


Thr 


Pne 










45 


Glu 


Trp 


Ala 


Leu 


Leu 










60 


Val 


Met 


Leu 


Glu 


Asn 










75 


Val 


Asn 


Lys 


Pro 


Ser 










90 


Val 


Val 


Thr 


Glu 


Glu 










105 


Leu 


Glu 


Thr 


Leu 


Leu 



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110 115 120 



T."\7"C! 






Trp 


Leu 


rnV. -v- 
±xJX 


XT X. \J 


Ti\7Cl 

JLljr 0 


Lys 


Asn 


Val 


Phe 


Arg 


Lys 


Glu 










125 










13 0 










135 






T Arcs 


Gly Val 






Glu 




Ser 


His 




Gly Val 


Lys 










1 An 










145 










150 




"A C2T1 


m n 




7\ en 






IT i its 


J-l jr 0 


V CJ.-1- 


xrxxc 


Ser 


X XXJL 


fa 


Ser 










XD 3 










160 










165 


7\ on 
jri.»DXX 


j_i t; u. 


JL IIJ. 




rlX S 






-L -L t: 


His 


Thr 


vir_L _y 


Glu 


-i_j_Y fa 




J- JT 










1 "7 n 
X / U 










175 










180 


Asp 


v_ys 






L-.ys 




Lys 








0 t:J_ 


Arg 






Leu 










X OD 










J- J' w 










195 








Lys 


Arg 


XX^ 


XlX 0 


Asn 


ijxy 


\J-L LL 


xiy fa 


XT J_ 


m 

Tyr 


r*i n 


^—jf fa 










0 n n 
U U 










205 










210 








vj±y 


Lys 


jTIXCL 




06X 










Leu 


Arg 


Leu 










^Xd 










^ ^ \J 










99R 

<cj ^ ~J 




T 


Arg" 


xxe 


His 


rpT-, -y^ 
±11X 


Vjxy 


oX U. 


Lys 


XT J, KJ 


Tyr 


Glu 


Cys 


Asn 


VJ7 J-XX 










0 Q A 




















94.(1 


Cys 




XlXS 


vax 


Jrxie 


Arg 


X llX 


oer 


v^ys 


Asn 


Leu 


Lys 


Ser 


xlxS 


J_i_y fa 










0 yi c 
z4d 










9Rn 

Z ^ u 










9 

£U ^ ~J 


Aircf 




XlXS 


Tiir 


/-I n ^ _ 

Ca^xy 


virX U. 


Asri 


AXS 


XIXS 


vjX U. 


Cys 


Asn 


(jxn 


Cys 






















9 fiR 

^ 0 «^ 










970 


Liys 




iriie 


Ser 


Thr 


Arg 


Ser 


Ser 


T 

Leu 


±IIX 


Gly His 


Asn 


Ser 


J- X t; 










0 "7 R 










9 R n 
z 0 u 










9 

^ 0 ~j 


rllS 


Tin IT 




IjXlJ. 


Lys 


irX 0 


Tyr 


kaXU 


V-.^ fa 


XIX fa 


Asp 


Cys 


Gly 


Lys 


JL XXJ_ 










r> Q n 
zy u 










295 










3 00 






Ly s 


Ser 


Ser 


iyr 


Leu 


JL liX 


Gin 


His 


Val 


Arg 


Thr 


g 

XIX s 


X XXJL 










0 UD 










310 










315 








Pro 




Vj-L tj. 




AS3n 


Glu 


Cys 


Gly Lys 


Ser 


Phe 


Ser 










320 










325 










330 


Gov 






Ser 


Leu 




V Cl J. 


nxo 


Lys 


Arg 


lie 


His 


Thr 


Gly 


Cl 11 










335 










340 










345 


Lys 


Pro 


Tyr 


Glu 


Cys 


Ser 


Asp 


Cys 


Gly Lys 


Ala 


Phe 


Asn 


Asn 


Leu 










350 










355 










360 


Ser 


Ala 


Val 


Lys 


Lys 


His 


Leu 


Arg 


Thr 


His 


Thr 


Gly Glu 


Lys 


Pro 










365 










370 










375 


Tyr 


Gin 


Cys 


Asn 


His 


Cys 


Gly 


Lys 


Ser 


Phe 


Thr 


Ser 


Asn 


Ser 


Tyr 










380 










385 










390 


Leu 


Ser 


Val 


His 


Lys 


Arg 


lie 


His 


Asn 


Arg 


Trp 


lie 
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<400> 30 



Met 


Ser 


Asn 


Glu 


Leu 


Asp 


Phe 


Arg 


Ser 


Val 


Arg 


Leu 


Leu 


Lys 


Asn 


1 








5 










10 










15 


Asp 


Pro 


Val 


Asn 


Leu 


Gin 


Lys 


Phe 


Ser 


Tyr 


Thr 


Ser 


Glu 


Asp 


Glu 










20 










25 










30 


Ala 


Trp 


Lys 


Thr 


Tyr 


Leu 


Glu 


Asn 


Pro 


Leu 


Thr 


Ala 


Ala 


Thr 


Lys 










35 










40 










45 


Ala 


Met 


Met 


Arg 


Val 


Asn 


Gly 


Asp 


Asp 


Asp 


Ser 


Val 


Ala 


Ala 


Leu 










50 










55 










60 


Ser 


Phe 


Leu 


Tyr 


Asp 


Tyr 


Tyr 


Met 


Gly 


Pro 


Lys 


Glu 


Lys 


Arg 


He 










65 










70 










75 


Leu 


Ser 


Ser 


Ser 


Thr 


Gly 


Gly 


Arg 


Asn 


Asp 


Gin 


Gly Lys 


Arg 


Tyr 










80 










85 










90 


Tyr 


His 


Gly 


Met 


Glu 


Tyr 


Glu 


Thr 


Asp 


Leu 


Thr 


Pro 


Leu 


Glu 


Ser 
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Pro Thr His 
Pro Glu Tyr 
Glu Gly Ala 
Gly Pro Ser 
Pro Thr Thr 
Glu Ser Ile
Ser Thr Phe
Ile Leu Lys
Ser Leu Lys
Ile His Ile
Gly Gln Phe
Gly Leu Ala
Val Phe Asp
Lys His Trp
Asp Val Ala
Glu Glu Val
Glu Glu Ala
Phe Ser Ser
Ile Asp Thr
Arg Ala Val
Arg Lys Met
Lys Cys Pro
Ser Gly Phe
Asp Leu Glu
Ser Ser Leu
Ser Ser Ser
Thr Glu Glu
Asp Leu Gln
Val Phe Asp 
Arg Asn Ala 
Tyr Lys Val 









T 

Leu 


iYie c 


Lys 








XT X. \J 




Leu 








T 

Leu 


Pro 


i nxr 




1 AO 






T .f=i1 1 

J—it: U. 






X 3 D 




Asp 




m 

Tyr 




1 "7 n 

X / u 






ijxy 






XOO 




T 

Lys 


Asp 


Asp 




z uu 




Txir 


Ser 


Pro 




Z X J 




Ser 


Asp 


Fne 




O "3 A 

Zo U 




Lys 


Ser 


Pi ir 
tjrXy 




O >l c: 




Tyr 


Pro 


vax 




o n 

Z D u 




Leu 


en CIT^ 






O '"7 c 




Asn 


IjXU 


Lys 




o Q n 




XlJ. S 


Ser 


Arg 




"3 n R 




Asp 


^ys 


xjys 




"7 n 
^ z u 








Asn 




"5 R 
O J O 




Lys 


vax 


jfne 




R n 
J D U 






Lys 


Pl -VT- 

\jxy 




•3 R 




Tyr 


Asp 


Cys 




"5 Q A 
O O U 




Cys 


ijxn 


xxe 




Q R 

J y 3 




Arg 


Asp 


Asp 








Asp 


Ser 


Ser 




yi o R 




Arg 


vj±y 


Asn 








inr 


Pro 


Pro 




/I R R 




isjun 


Arg 


Ser 




4t / u 




Asn 


Arg 


Leu 




yi Q R 




Phe 




Pro 




R n n 




Arg 


V a.X 


Leu 




515 




Ala 


Leu 


Met 




530 




He 


Ser 


Glu 




545 




Tyr 


Lys 


Lys 




560 





Phe Leu Thr 
Leu Lys Lys 
Pro Gly Lys 
Ala Gly Ser 
Asp Asn Gly 
Pro Pro Thr 
Pro Gln Glu
Glu Pro Pro
Glu Tyr Thr
Glu Ser Pro
Thr Leu Arg
Asn Lys Val
Val Pro Val
Gln Pro Thr
Glu Asn Phe
Ala Leu Ser
Ile Gly Val
Val Lys Gly
Gly Leu Gly
Lys Ile Phe
Glu Arg Lys 
Asn Ser Gly 
Glu Thr Thr 
Val Leu Phe 
Gly Gly Ala 
Pro Leu Lys 
Leu Pro Ser 
Leu Tyr Val 
Leu Lys Thr 
Lys Tyr Gly 
Cys Lys Arg 



inn 
X u u 






Pi 71 


Asn 


vctx 


1 1 R 
XX 3 






Asn 


Asn 


Leu 


1 n 

X3 \J 










JT JL V— ' 


1 AR 






vax 


Asp 


C* /~\-^ 


X D U 






OcX 


Leu 


Asn 


1 "7 R 
X / D 






PI n 


Arg 


m 

Trp 


xy u 






oer 




Leu 


o n R 






Cys 


Pro 


Pl 11 

(jXU 








Leu 


Pi IT- 

vjxy 


Ser 


O "3 R 






JMeu 


Ala 

AX a 


Tyr 


o R n 






Thr 


Pro 


A 1 S3 

Axa 


O R 
Z D D 






Lys 


Ser 


vax 


1 Q n 
z o U 






PI n 


PI -n 


Leu 


9 Q R 








Lys 


Pl -n 


.3 X U 






Asn 


Tnr 


va.x 


"5 O R 
J AD 








Vo.X 


Trp 


-J 4t U 






7\ GTI 


^ J to 


T .01 1 


R R 






vax 


Pro 


T 

Leu 


J / u 






± IIX 


r2l n 


A "KTT 


"5 Q R 
O O D 






Pt ro 


Asp 


T 

Lys 


Ann 






CjXn 


pne 


Arg 


/I 1 R 






vax 


Lys 


^jfXy 


4 J u 






Tyr 


Leu 


Arg 


/I /I R 






X Xfc; 


Pro 


Asn 


ARC) 
f± o u 






XTiXCt 


JrX U 


OCX 


/I *? R 






Arg 


Tnr 


Cys 


Q n 






Lys 


Pl m 

vjxn 


TV 1 = 

AX a 


R n R 

O U 3 






Arg 


Arg 


1 T 1 

kjXU 


C O A 






Pro 


Asp 


Leu 


535 






Phe 


Pro 


Glu 


550 






Gly 


He 


Leu 


565 











1 n R 

X U D 


osx 


Vjxy 


Thr 






1 O A 

xz u 




Ser 


Leu 






1 *5 R 

xo o 




±rX O 


Al =3 






IRA 
X D U 


iyr 


Leu 


Leu 






1 fiR 
X D O 


Ofe=X 


Leu 








1 Q A 
Xo U 


Pl n 

vjxn 


Pro 


Asp 






1 O R 


jtrne 


Pro 


Asp 






O 1 A 

z xu 


Asp 


lyr 


ir X <j 






DOC 

Z Z 5 


Pro 


Lys 


TV T ^ 

AX a 






O /I A 


Leu 


Asn 


Lys 






O R R 

Zoo 


Pl -tr 

vjxy 


O 1 t r 

LrXy 


Lys 






O "7 A 

z / u 


vax 


Met 


vax 






o o c 
Zoo 


Arg 


Fne 


Trp 






"3 A A 

o U U 


Arg 


vax 


Tl j= 
XX€ 






1 R 
^ XO 


Pl T1 


xlxS 


Tl i=» 

xxe 






"3 ^5 A 


Asn 


vax 


Asn 






"3 /I R 


Ser 


Tnr 


Asp 






"3 *C A 
O D U 


Asn 


Leu 


Pl n 

vjxn 






•2 "7 R 
J /O 


Leu 


vax 


TT-? — , 

rllS 






"5 Q A 

o y U 


Gly 


TV 1 ^ 

AXa 


Glu 






4Uo 


Arg 


Lys 


vax 






/I O A 

4zU 


Cys 


Leu 


Leu 






^ o c 
4 Jo 


Pro 


/^T t-i 

tjiXU 


Thr 






yi R A 


Va.X 


XlXo 








yi R 

4b O O 


TV T --1 

AX a 


^jXy 


Pro 






y1 Q A 
4t O U 


Ser 


Pro 


jrne 






yi o c 

4y O 


Lys 


CsXU 


GXy 






R I A 
O X U 


Thr 


Glu 


Glu 






ET O ET 

525 


Lys 


Gly 


Leu 






540 


Glu 


Asn 


He 






555 


Val 


Asn 


Met 






570 
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Asp Asn Asn He He 
575 

Leu Asp Met Gly Glu
590 

Glu Leu 



Gln His Tyr Ser Asn
580 

Leu Asp Gly Lys He 
595 



His Val Ala Phe Leu 
585 

Gln Ile Ile Leu Lys
600 



<210> 31 
<211> 816 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature
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Met 


Glu 


He 


Gly 


Ser 


1 








5 


Met 


Va± 


Pro 


Arg 


Arg 
20 


Lys 


Leu 


Leu 


AX a 


Asn 

35 


Val 


Tyr 


Leu 


Tyr 


Glu 
50 


Arg 


Val 


Asn 


Arg 


Glu 
65 


VaJ_ 


Thr 


-T- "1 ^ 

He 


Phe 


Gly 
80 


Ser 


Leu 


Tyr 


Thr 


Ajia 
95 


Asp 


Leu 


Asp 


VaJ_ 


Thr 
110 


Phe 


Lys 


Val 


Ser 


He 
125 


Arg 


Ser 


Phe 


Phe 


Ser 
140 


Gly 


Gly 


Arg 


Glu 


Val 

155 


Ala 


Met 


Trp 


Lys 


Met 
170 


Phe 


Tyr, 


Lys 


Ala 


Gin 
185 


Asp 


He 


His 


Asn 


He 
200 


His 


Arg 


Val 


Lys 


Phe 
215 


Val 


Thr 


His 


Cys 


Gly 
230 


Val 


Thr 


Arg 


Arg 


Pro 
245 


Glu 


Asn 


Gly 


Gin 


Thr 
260 


Glu 


Lys 


Tyr 


Thr 


Leu 

275 


Gin 


Val 


Gly 


Gin 


Glu 
290 


Cys 


Asn 


He 


Val 


Ala 
305 


Asn 


Gin 


Thr 


Ser 


Thr 
320 


Asp 


Arg 


Gin 


Glu 


Glu 
335 



7493789CD1 



■Ti n _ 

Ala 


Gly 


Pro 


Ala 


Gly 
10 


Pro 


Gly 


Tyr 


Gly Thr 










25 


Cys 


Phe 


GXn 


Val 


Glu 

40 


VaJ. 


Asp 


ixe 


Lys 


Pro 
55 


Val 


Val 


Asp 


Ser 


Met 
70 


Asp 


Arg 


Arg 


Pro 


Val 

o c 
oD 


Asn 


Pro 


Leu 


Pro 


Val 
100 


Leu 


Pro 


uxy 


Glu 


Gly 
115 


Lys 


Phe 


Va± 


Ser 


Arg 
130 


Ala 


Pro 


Glu 


Gly 


Tyr 
145 


Trp 


Phe 


Gly 


Phe 


His 

160 


Met 


Leu 


Asn 


He 


Asp 
175 


Pro 


Val 


He 


Gin 


Phe 
190 


Asp 


Glu 


Gin 


Pro 


Arg 
205 


Thr 


Lys 


Glu 


He 


Lys 
220 


Thr 


Met 


Arg 


Arg 


Lys 

235 


Ala 


Ser 


His 


Gin 


Thr 
250 


Val 


Glu 


Arg 


Thr 


Val 
265 


Gin 


Leu 


Lys 


Tyr 


Pro 

280 


Gin 


Lys 


His 


Thr 


Tyr 
295 


Gly 


Gin 


Arg 


Cys 


He 
310 


Met 


He 


Lys 


Ala 


Thr 
325 


He 


Ser 


Arg 


Leu 


Val 
340 



Ala 


Gin 


Pro 


Leu 


Leu 
15 


Met 


Gly Lys 


Pro 


xxe 










30 


He 


Pro 


Lys 


xxe 


Asp 
45 


Asp 


Lys 


Cys 


Pro 


Arg 
60 


Val 


Gin 


His 


Phe 


Lys 
75 


Tyr 


Asp 


Gly 


Lys 


Arg 
90 


Ala 


Thr 


Thr 


<jXy 


vax 
105 


Gly Lys 


Asp 


Arg 


Pro 










120 


Tyr 


Thr 


Pro 


vax 


Gly 
135 


Asp 


His 


Pro 


Leu 


Gly 
150 


Gin 


Ser 


Val 


Arg 


Pro 

165 


Val 


Ser 


Ala 


Thr 


Ala 
180 


Met 


Cys 


Glu 


Val 


Leu 
195 


Pro 


Leu 


Thr 


Asp 


Ser 
210 


Gly 


Leu 


Lys 


Val 


Glu 
225 


Tyr 


Arg 


Val 


Cys 


Asn 
240 


Phe 


Pro 


Leu 


Gin 


Leu 
255 


Ala 


Gin 


Tyr 


Phe 


Arg 
270 


His 


Leu 


Pro 


Cys 


Leu 

285 


Leu 


Pro 


Leu 


Glu 


Val 
300 


Lys 


Lys 


Leu 


Thr 


Asp 
315 


Ala 


Arg 


Ser 


Ala 


Pro 
330 


Arg 


Ser 


Ala 


Asn 


Tyr 
345 
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Glu 


Thr 


Asp 


Pro 


Phe 
n 

O 3 U 


Val 


Gin 


Glu 


Phe 


Glu 


Met 


Ala 


His 


Val 
^ 

•3 O O 


Thr 


Gly 


Arg 


Val 


Gin 


Tyr 


Gly 


Gly 


Arg 
o o u 


Asn 


Arg 


Thr 


Val 


Val 


Trp 


Asp 


Met 


Arg 

Q R 


Gly 


Lys 


Gin 


Phe 


Lys 


Met 


Trp 


Ala 


He 

^ J_ u 


Ala 


Cys 


Phe 


Ala 


Glu 


Glu 


lie 


Leu 


Lys 


Gly 


Phe 


Thr 


Asp 


Lys 


Asp 


Ala 


Gly 


Met 
/I /I n 

41:41 u 


Pro 


He 


Gin 


Gly 


Tyr 


Ala 


Gin 


Gly 


Ala 

4i:D D 


Asp 


Ser 


Val 


Glu 


Lys 


Asn 


Thr 


Tyr 


Ser 

41: / U 


Gly 


Leu 


Gin 


Leu 


Gly 


Lys 


Thr 


Pro 


Val 

/I Q R 


Tyr 


Ala 


Glu 


Val 


Leu 


Leu 


Gly 


Met 


Ala 

c n n 
3 U U 


Thr 


Gin 


Cys 


Val 


Lys 


Thr 


Ser 


Pro 


Gin 

D ± D 


Thr 


Leu 


Ser 


Asn 


Val 


Lys 


Leu 


Gly 


Gly 

R n 
D J u 


He 


Asn 


Asn 


He 


Pro 


Ser 


Val 


Phe 


Gin 

A Ci 
D 4fc D 


Gin 


Pro 


Val 


He 


Thr 


His 


Pro 


Pro 


Ala 

O D u 


Gly 


Asp 


Gly 


Lys 


Val 


Val 


Gly 


Ser 


Met 

R '7 R 


Asp 


Ala 


His 


Pro 


Val 


Arg 


Val 


Gin 


Arg 
R Q n 

D 27 u 


Pro 


Arg 


Gin 


Glu 


Ser 


Met 


Val 


Arg 


Glu 


Leu 


Leu 


He 


Gin 


Phe 


Lys 


Pro 


Thr 


Arg 
o n 


He 


He 


Phe 


Tyr 


Gly 


Gin 


Phe 


Arg 


Gin 
"5 R 


Val 


Leu 


Tyr 


Tyr 


Glu 


Ala 


Cys 


He 


Ser 

DDI/ 


Leu 


Glu 


Lys 


Asp 


Tyr 


lie 


Val 


Val 


Gin 
^ ^ R 

ODD 


Lys 


Arg 


His 


His 


Asp 


Arg 


Thr 


Glu 


Arg 

D oU 


Val 


Gly 


Arg 


Ser 


Thr 


Thr 


Val 


Asp 


Thr 

<^ Q R 
D ^ O 


Asp 


He 


Thr 


His 


Tyr 


Leu 


Cys 


Ser 


His 
"71 n 

/ J- u 


Ala 


Gly 


He 


Gin 


His 


Tyr 


His 


Val 


Leu 

■"7 O C 
/AO 


Trp 


Asp 


Asp 


Asn 


Leu 


Gin 


Leu 


Leu 


Thr 


Tyr 


Gin 


Leu 


Cys 


Thr 


Arg 


Ser 


Val 


Ser 

■7 R R 


He 


Pro 


Ala 


Pro 


V a.x 


Ala 






AT 3 

770 




±yj. 


His 


Leu 


Ser 


Ala 


Glu 


Gly 


Ser 
785 


His 


Val 


Ser 


Gly 


Pro 


Gin 


Ala 


Leu 


Ala 
800 


Lys 


Ala 


val 


Gin 


Arg 


Thr 


Met 


Tyr 


Phe 


Ala 
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±-iy & 


Val 


Arg 




^ S S 

J w/ J 










3 D \J 


Leu 


JTX k-* 


Ala 


Pro 


Met 


Leu 


^70 










'^7R 


Ala 


J. XJLX 


JrX w 


Ser 


His 


Ol \7' 


J o ^ 










J -7 


His 


Thr 


VjX jf 


Val 


Glu 


He 


Ann 










405 


J. XJ.±. 


vjrXXX 


jtix y 


fil n 


V— Jf o 


^x y 


A1 R 

r± X J 










A9 0 




Leu 


fix y 




Ti 
X X 


OtiX 












A"^ R 

^3 3 


Gin 


Pro 


Cys 


Phe 


Cys 


Lys 


AA R 
4t4fc O 










ARO 
^3 V/ 






iri JLc: 


Arg 


TTt cs 
XiXo 


XICU. 


*± D U 










*i D 3 


X X 


X X ti 


V ctx 


Ti ^ 

X X t! 




It X 


A7 R 










^ O VJ 


Lys 


Arg 


vax 


vjx_y 


Asp 


±11X 


AQn 










AQR 

J73 


V7X11 


V O.X 


xj_y & 




Val 

V CIX 


Ti P» 

XXC: 


R n R 










R1 n 

3 J. U 


Leu 


*^ys 


T 

Leu 


Lys 


Ti la 


Asn 


R9 n 

3 ^ U 










R9 R 

3^3 


Leu 


Val 


Pro 


His 


Gin 


Arg 


R'^ R' 
3 3 3 










RAD 

3 ^ U 


Jrilt= 


Leu 




a1 ja 






R R n 

3 3 W 










RRR 

3 3 3 


Lys 


Pro 


Ser 


He 


Ala 


Ala 


RfiR 

3 O 3 










R7n 

3 / V 


Ser 


z-ix y 


Tyr 


Cys 


Ala 


Thr 


R R n 

3 O W 










RRR 

3 O 3 


X X t; 


X X c 


VjXII 


Asp 


Leu 


al j=i 

xax d 


R Q R 
3-7 3 










U VJ w 


XT XXC 


lyr 




OCX 


Thr 


r\S. y 


D J_U 










f^l R 
Ox3 


TV "TTT 

JnJL y 




t^l \7 


Val 


ocsx 


fil n 


3 










D3 U 


vjX U. 


Leu 


T 

Leu 


Al » 


Ti «=a 
X X€ 


Axg 


^ A n 

o ^ u 










A R 
D ^ 3 


lyr 




±r X U 


VjXj^ 


Ti o 
xx^ 


± IIX 


A R R 
D 3 3 










O D U 


X I IX 


-riJ-y 


Leu 






Al a 


O / \J 










fi7R 
D / 3 


vjj-y 


A en 


Ti pa 

XXC= 


P^r•r^ 
irx \j 


Al a 


ni v 


^RR 
ODD 










Di/ U 


JT X \J 


±yx^ 


VjrX u. 


Phe 


A en 




ion 










7nR 

/ W 3 


Gly 


Thr 


C!<=>T~ 
OtJX 


Arg 


Pro 


Ser 


71 R 

/ X 3 










79 0 






JLilX 


Al a 


A CO 


VjX LL 


/ 3 U 










7"^ R 
/ 3 3 


nxD 


±11X 


lyr 


V cix 


r\x y 




7 AR 
/ ft3 










7Rn 

/ 3 U 


i\xa 


Tyr 


m 

Tyr 


2\ 1 a 

ii±a 




Leu 


1 r\ 

/ D U 










7 ^; R 

/ D 3 


Val 


Asp 


Lys 


Glu 


His 


Asp 


775 










780 


Gin 


Ser 


Asn 


Gly 


Arg 


Asp 


790 










795 


He 


His 


Gin 


Asp 


Thr 


Leu 


805 










810 
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Met 


Met 


Asp 


Ser 


Glu 


Asn 


Lys 


Pro 


Glu 


Asn 


Asp 


Glu 


Asp 


Glu 


Lys 


1 

JL 


















1 n 










15 


lie 


Asn 


Lys 


Glu 


Ala 

9 0 
^ \j 


Gin 


Asp 


Leu 


Thr 


Lys 

9 R 


Leu 


Ser 


Ser 


His 


Asn 

30 


Glu 


Asp 


Gly 


Gly 


Pro 


Val 


Ser 


Asp 


Val 


He 

AO 


Ala 


Ser 


Phe 


Pro 


Glu 
45 


Asn 


Ser 






Lys 








OCX 


VIT JL LI 
O D 


iDfcJX 


Ser 


Asn 


Ser 


Asp 
60 


Ser 


Val 


Val 


He 


Gly 

O Zj 


Glu 


Asp 


Arg 


Asn 


Lys 
/ u 


His 


Ala 


Ser 


Lys 


Arg 
75 


Arg 


Lys 


Leu 


Asp 


Glu 

O \J 


Ala 


Glu 


Pro 


Leu 


Lys 

R 


Ser 


Gly 


Lys 


Gin 


Gly 
90 


lie 


Cys 


Arg 


Leu 


Glu 


Thr 


Ser 


Glu 


Ser 


Ser 


Val 


Thr 


Glu 


Gly Gly 




















inn 
X u u 










THR 

X Uo 


He 


Ala 


Leu 


Asp 


Glu 


Thr 


Gly 


Lys 


Glu 


Thr 

1 1 R 
XXD 


Phe 


Leu 


Ser 


Asp 


Cys 
120 


Thr 


Val 


Gly 


Gly 


Thr 

1 9 R 


Cys 


Leu 


Pro 


Asn 


Ala 
1 "5 n 

Xo u 


Leu 


Ser 


Pro 


Ser 


Cys 
135 


Asn 


Phe 


Ser 


Thr 


He 

J- f± Lf 


Asp 


Val 


Val 


Ser 


Leu 

1 A R 
X ^ 3 


Lys 


Thr 


Asp 


Thr 


GxU 
1 RO 


Lys 


Thr 


Ser 


Ala 


Gin 

1 SR 

_L J J 


Glu 


Met 


Val 


Ser 


Leu 
1 fin 

X o u 


Asp 


Leu 


Glu 


Arg 


Glu 
165 


Ser 


Pro 


Phe 


Pro 


Pro 

17 0 

JL / U 


Lys 


Glu 


He 


Ser 


Val 

1 "VR 
X / D 


Ser 


Cys 


Thr 


He 


Gly 
180 


Asn 


Val 


Asp 


Thr 


Val 


Leu 


Lys 


Cys 


Gin 


He 


Cys 


Gly His 


Leu 


Phe 




















1 Qn 

X J/ u 










195 


O C J. 






o t: J_ 


9 n n 
^ u u 


Leu 


JL LL 


Lys 


His 


riXo. 

o n R 


VjXU 


Ser 


His 


Met 


Gin 
210 


Gin 


Pro 


Lys 


Glu 


His 

9 1 R 


Thr 


Cys 


Cys 


His 


Cys 
o o n 


Ser 


His 


Lys 


Ala 


Glu 

225 


Ser 


Ser 


Ser 


Ala 


Leu 
9 n 


His 


Met 


His 


He 


Lys 

O R 
Z D 


Gin 


Ala 


His 


Gly 


Pro 
240 


Gin 


Lys 


Val 


Phe 


Ser 
245 


Cys 


Asp 


Leu 


Cys 


Glv 
250 


Phe 


Gin 


Cys 


Ser 


Glu 

255 


Glu 


Asn 


Leu 


Leu 


Asn 

260 


Ala 


His 


Tyr 


Leu 


Gly 

265 


Lys 


Thr 


His 


Leu 


Arg 
270 


Arg 


Gin 


Asn 


Leu 


Ala 
275 


Ala 


Arg 


Gly 


Gly 


Phe 
280 


Val 


Gin 


He 


Leu 


Thr 
285 


Lys 


Gin 


Pro 


Phe 


Pro 
290 


Lys 


Lys 


Pro 


Arg 


Thr 
295 


Met 


Ala 


Thr 


Lys 


Asn 
300 


Val 


His 


Ser 


Lys 


Pro 
305 


Arg 


Thr 


Ser 


Lys 


Ser 
310 


He 


Ala 


Lys 


Asn 


Ser 
315 


Asp 


Ser 


Lys 


Gly 


Leu 
320 


Arg 


Asn 


Val 


Gly 


Ser 
325 


Thr 


Phe 


Lys 


Asp 


Phe 
330 


Arg 


Gly 


Ser 


He 


Ser 


Lys 


Gin 


Ser Gly 


Ser 


Ser 


Ser 


Glu 


Leu 


Leu 










335 










340 










345 


Val 


Glu 


Met 


Met 


Pro 
350 


Ser 


Arg 


Asn 


Thr 


Leu 
355 


Ser 


Gin 


Glu 


Val 


Glu 
360 


He 


Val 


Glu 


Glu 


His 
365 


Val 


Thr 


Ser 


Leu 


Gly 
370 


Leu 


Ala 


Gin 


Asn 


Pro 
375 


Glu 


Asn 


Gin 


Ser 


Arg 


Lys 


Leu 


Asp 


Thr 


Leu 


Val 


Thr 


Ser 


Glu 


Gly 
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Leu 


Leu 


Glu 


His 


Gly 


Asn 


Leu 


Val 


Leu 


Leu 


Lys 


Gly 


Arg 


Gly 


Thr 


Thr 


Gin 


Met 


Glu 


Ala 


Cys 


Thr 


Gin 


Glu 


Ser 


Gly 


Leu 


Thr 


Leu 


Cys 


Thr 


Asp 


Leu 


Lys 


Phe 


Tyr 


Asp 


Leu 


Asp 


Ser 


Val 


Leu 


Xle 


Asn 


Leu 


Leu 


Cys 


Thr 


Glu 


Glu 


His 


Pro 


Lys 


Thr 


Leu 


Pro 


Leu 


Asp 


Asp 


Ser 


Val 


Ser 


His 


Gin 


Cys 


Lys 


Thr 


Arg 


His 


Cys 


Lys 


Ala 


Lys 


His 


He 


Asn 


He 


Gly 


Gly 


Ala 


Asn 


Arg 


He 


Glu 


Leu 


Glu 


Lys 


Gly 


Ser 


Thr 


Arg 


Gly 


Arg 





o o u 




Lys 


Leu 


CjtXU 




"5 Q R 




Ser 


vax 


Thr 










Asn 


Ser 




AO R 
4t Z D 






x^_La 


Lys 




/I /i n 




Ser 


<CT±U 


Thr 




A R R 
4t D D 




Lys 


Thr 


TT-? /-» 

rlXS 




/I "7 n 
4t / U 




Ser 


Ser 


vax 




>] Q 




TV T _ 


LjXU 


(jrXn 




c; n n 
3 U U 




JtLJLS 


osr 


Leu 








AX a 


Cys 


Thr 




1 n 




VirXU 


TT Q 
Xx6 


rlXS 




R / R 




Cys 


Arg 


Tnr 




D b u 






iriX S 


Leu 




O / D 




Ser 


L-ys 


vjxn 




R Q n 




Arg 


Asp 


rlXS 




c n R 




Pro 


Cys 


Asn 




o n 
u z u 




Lys 


i-i_L a. 


Tnr 




bo D 




Leu 


"1 v-1 

(jxn 


Ser 




c R n 
bo u 




oer 


Thr 


Leu 




b b o 






Lys 


Ala 




con 
b o U 






Asn 


LrXU 




<^ Q R 

b y 3 




Lys 


Cys 


irne 




/ xu 




x±e 


Lys 


Leu 




•7 O R 






Asn 


Leu 








Lys 


Arg 


Ser 




"7 R R 
/DO 




Leu 


Ser 


Fne 




'7'7 n 




Asp 


Lys 


Lys 




'7 Q R 




tj±y 


llX S 


xxe 




O A O 
O U U 




Gly 


Met 


Leu 




815 




Lys 


Asp 


Asp 




830 




Pro 


Lys 


Gly 




845 





Ser Thr Lys 
Ser Arg Pro 
Phe Arg Arg 
Lys Arg Phe 
Gln Arg Met 
Asp Ala Glu 
Gln Arg Val 
Gly Gln Gly 
Thr Val Lys 
Asp Cys Gly 
Val Lys Arg 
Cys Asp Phe 
His Ser Asn 
Cys Cys Ser 
Met Lys Glu 
Leu Phe Phe 
Glu Lys His 
Ser Asn Ser 
Glu Ser Glu 
Ser Gln Glu 
Val Arg His 
Tyr Lys Thr 
Arg His Gly 
Tyr Ser Leu 
Lys His Leu 
Glu Glu Cys 
Glu Glu Phe 
Gly Val Gln 
Ala Ser Glu 
Glu Leu Ala 
Asn Ile Ser 



"3 Q R 






Asn 


inr 


Leu 


/inn 






Arg 


Pro 


ni n 

VJT-L U. 


yi 1 R 






Arg 


Ser 


Ser 


>i "5 n 

4t J u 






Asn 


Leu 


Leu 


/] /l R 






Tyr 


TV/Tq -f- 


Lys 


4t b U 






Ser 


vax 


Leu 


A '1 C 

4 /D 






Cys 


vax 


Thr 








Ser 


7\ "1 _ 

Axa 


Arg 


R n R 






Pro 


Ala 


Ser 


C O A 






GXn 


vax 


AXa 


C "3 C 

DO D 






Cys 


Hxs 


Tl 1 

AXa 


R R n 
DDU 






ber 


Ser 


Met 


C iZ (Z 

D bD 






(jXn 


His 


Gin 


C O A 
D oU 






pne 


xxe 


Ser 


R Q R 

Dy D 






Lys 


JbllS 


Asn 


bXU 






Leu 


Ser 


vjXU 


bz D 






xxe 


Asn 


Ser 


b4fc U 






Asp 


Leu 


vax 


C R R 
bDD 






Asn 


TV 1 •=> 

Axa 


Lys 


b / U 








Pro 


Leu 


tz o vz 
boo 






Ser 


Ser 


Lys 


■"7 A A 
/ UU 






Arg 


Ser 


Ser 


715 






(jXn 


Asp 


Tyr 


Tin 
/ o U 






Ser 


Lys 


VjXU 


•7 /I R 

/ 4D 






CjrXU 


Asn 


AXa 


/ bU 






xxe 


CjXU 


Arg 


•7 "7 R 
/ /D 






Asp 


vax 


Ser 


'1 a r\ 






Leu 


kirXn 


vjXU 


805 






Glu 


Leu 


Ser 


820 






Ser 


Thr 


Thr 


835 






Arg 


Thr 


Cys 


850 











n Q A 

oy u 




A J. a 


AX a 






/I n R 
4Ud 


Arg 


Asn 


xxe 






/ion 


Tiir 


Irne 


inr 






>i "3 n 
4o D 


kuXy 


xxe 


Lys 






/IRA 
44:3 U 


xlXS 


Leu 


Arg 






>i c c 
4oD 


Lys 


TT4 n 

xlXS 


Leu 






/? Q n 
4fcoU 


Thr 


Ser 


VaXU 






/I A C 

4y D 


Pro 


Pro 


Asp 






R 1 n 
dX(J 


Q»xy 


Ser 


VaXn 






f o tr 

525 


Thr 


Asn 


Arg 






540 


Arg 


Glu 


Met 






nr c cr 

d5d 


Ser 


Arg 


Arg 






c; T A 
D /U 


Vjxn 


Thr 


TV 1 'a 

AX a 






rr o CZ 
DOD 


Leu 


Asp 


GXU 






c n n 
b UU 


Met 


His 


Phe 






bXD 


Lys 


Asp 


Vax 






C3 A 

bo U 


Leu 


vax 


^xn 






545 


Leu 


Gin 


Thr 






/r c r\ 

660 


Glu 


Ser 


Met 






C T c 
O /D 


Lys 


Ser 


Arg 






690 


Pro 


Gin 


Phe 






705 


Thr 


vax 


Leu 






720 


His 


pne 


T «i 1 

Leu 






/o5 


CjXy 


JMLeu 


kjXU 






■7 C A 
/DU 


Lys 


Lys 


Asn 






I-? /- 1~ 

765 


Vax 


Cys 


xxe 






r-J O A 

780 


Gly 


Asn 


GXy 






795 


His 


Ser 


Tyr 






810 


Gin 


Ser 


Gly 






825 


Thr 


Pro 


Lys 






840 


Ser 


His 


Cys 






855 
Gly 


XjSU 


Leu 


Ala Ser 








860 


Airg 


Lys 


His 


Ser His 








875 


J. jr a- 


Tvr 


Thr 


Val Thr 








890 


Lys 


Lys 


His 


Lvs GT v 








905 


Ser 


Ser 


Asp 


He He 








920 


Gly 


Lys 


Lys 


ASTl AT 








^ 'D 


Ala 


Asn 


Lys 


Pro Ala 








950 


Asp 


Airg 


Gly 


Asn Se^T" 








^ O J 


O G J_ 


Leu 




w X _Y vIT X u. 








9 80 


Gin 


lie 


SeiT 


Ser Glu 








995 


Tvir 


Sex* 


Gin 










1 ni n 

X \J X VI 




to 


Vjy J. U. 


JrXXti OtiX 








X Zj -J 


V CLJL 


j_i_y to 


xiJ. y 


to jn X to 








1 DAO 

JL \J '±\J 




Atojwi 




JLjf X .rlXCt 








1055 


Ala 


TiuT 


Glu 


to xiXto 








1070 


Ala 


Asn 


Val 


Glu Ala 








1085 




Pure 




V7X IX vJX IX 








1100 


J. X ts 


O crx 




1^X11 irX kJ 








1115 




^ Jf 


OCX 


T n T »^i 1 
X X \Z- xjou. 








1130 


V CLU. 


Leu 


Jf to 


Al a Al 3 

-rt.X Cl, XA.X CL 








1145 


O C J- 


Asn 




A can 1 1 








1 1 fin 

X X u w 


AT a 

■rt. J- Cl 


It X ^ 


V dx 


XJjf to AtoJ^ 








XX / ^ 




Tlur 


Met 


OwX VJvTi 1 








1190 


Glu 


Asn 


Seir 










1205 


Ash 


His 


Glu 


lie Ser 








1220 


Glu 


Gly 


Glu 


VJXjf VjXj^ 








1235 


His 




His 


TiF>ii Ovs 








X^ ^ \J 






Pro 


Val Tieaii 

V Cl. X XJCIX 








X ^ U «J 


Asn 


Leu 


Glu 


Ser Gly 








1280 


Leu 


Glu 


Asp 


Leu Lys 








1295 


Lys 


Glu 


lie 


Leu Met 








1310 


Glu 


Glu 


Asp 


Gly Pro 



O CsX 


He 


Ttir 


XTauJXX J_i vZ^ 








865 


Gin 


Tvr 

X J^X 


Ser 


Tyr Leu 








88 0 


Lys 


Gly 


Asp 


Met Glu 








895 


AX y 


Val 


Glu 

vJX IX 


He Glu 








910 


Val 

V CLX 


Gl v 


Pro 


Gin Gl V 
V3X IX vjj X _y 








925 


ni V 


Ser 


Ala 


Val Thr 








94.0 


fil 11 
vjx tx 




IT X 










J ~j ^ 


He 


Glu 


Ala ' 


Glu Val 








y / yj 


V cl X 


A c;-n 


OtJX 


XIX to XJtJlX 








9R R 
_7 o ^ 


Pro 


Glu 


Asp 


p"hf=k Ala 

JriJ.C^ X^X CL 








1000 


Val 


Tlir 


Gl V 


Thr* nl \7 








1 ni R 

X \-/ X J 


Ala 


His 


Ser 


Ser Ala 








X W w 


Thr 


Ti*vc« 
xi_Y o 


Glu 


Phfi Gil] 

Jl JLi~ vj X IX 








J.\J 'dk ~J 


V CLX 


± JLIX 


nX y 


-cAX y o X IX 








1060 


Lys 


Met 


Lys 


Argr Gin 








1075 


vJX_y 


Ser 


Ala 










1 090 

X u 


His 


Gin 


Gin 


-TAiO 1 i. OCX 








1105 


Ser 


Asp 


TJnr 


J—/ Vi- J— I y *^ 








112 0 


Asn 


Glu 


Asn 


Thy* Acsn 








X X O -J 




Ser 


Val 


Gl 11 Va 1 

\J>L.WX V CLJ^ 








1150 




JTiX to 


OCX 


JrXlC to 








1 1 S 

J — L \J ~J 


Lys 


Val 


AT"fT 
"X y 


XJ jr to IT X 








X X o w 


Asn 


T\7"T" 


Gl V 


i3CX Ju X w 








X X -7 ^ 


Ser 


Ala 


Leu 


A c;n Pvc! 

f^toXX O 








X ^ X w 


Asn 


Asp 


Ala 


Glv Glu 








1225 


Asn 


Ala 


Gly 


Ac5in> Gl \r 








1 940 


Pro 


Val 


Tlnr 


XJCIX iT-O 








1 9^^ 

X^ «J •J 


Val 

V CLX 


Val 

V CLX 


J. XIX 


Avrr Tl o 
x^x y X X c 








X ^ / w 


Glv 


Gin 


Asn 


Arg Val 








1285 


Gly 


Val 


Gin 


Glu Asp 








1300 


Asn 


Ser 


Gin 


His Glu 








1315 


Ala 


Ser 


Asp 


Ser Thr 



Thr 


Val 


His 


He Arg 








870 


Cys 


Lys 


Val 


Cys Lys 








885 


Arg" 


His 


Cys 


Ala Thr 








900 


Ala 


Ser 


Glv 


Lys Hi s 








915 


Glv 


Ser 


Leu 


Glu Ala 








930 


Met 


Ser 


Asp 


Glu His 








945 


Val 


Leu 


Glu 


Lys Pro 








960 


Glu 


Asn 


Val 


Phe His 








975 


Leu 


Asp 


Lys 


T,VC3 nl 11 
xj_y o \j X IX 








990 


Gin 


Pro 


Gly 


Asp Val 








1005 


Glu 


Asn 


Lys 


Pvs Leu 








1020 


Ser 


Leu 


Glu 


XJ^?U> XXiX |9 








1035 


Phe 


i jr X 


Cys 


Met Ala 

X X^5 iJ. -U M. 








1050 


Met 


Thr 


AX. y 


His Ala 








1065 


Ser 


Tvr 


Leu 


Asn Ser 








1080 


Ser 


Lys 


Asn 


He He 








1095 


Glu 


Glu 


Phe 


Gin He 








1110 


Ser 


Arg 


Asn 


Al a Ala 








1125 


Leu 


Asp 


Met 


Cp»7- Tiva 








1140 


Glu 


Thr 


Glu 


Glu Glu 








1155 


Glu 


Thr 


Phe 


Gin Gin 








1170 


Glu 


Glu 


Met 


Met Ser 








1185 


Ser 


Arg 


Phe 


Gin Asn 








1200 


Glu 


Thr 


Ala 


Lys Lys 








1215 


Leu 


Arg 


Val 


His Cys 








1230 


Glv 


Glv 


Val 


Val Pro 








1245 


Gly 


Glu 


Arg 


Ser Ala 








1260 


X XXX 


AT^fT 

jHix y 


Gl 11 

VJX IX 


Gl n Gl V 

V3XXX virxjr 








1275 


Ala 


Arg 


Glv 

vjx.jr 


His Gly 








1290 


Pro 


Val 


Leu 


Gly Asn 








1305 


Thr 


Glu 


Phe 


He Leu 








1320 


Val 


Glu 


Ser 


Ser Asp 
1 1 9 

-1- «6 J 


Val 




Glu 


ThjC He 








1340 


X a. 




Phe 


Gly Aarg" 








1355 


P270 


Glu 


Asp 


Glv Glu 








1370 


Ala 




Gly 










13 85 


Gin 


Gly 


Val 










1400 


0 1; j_ 


Vi T~ 


A J. y 


JL JL C -ti.J_ y 










JLj6U, 














J- *± .J 


Glu 


Lys 


His 










JL ^ ^ J 


U. 




Asn 










J- *± 0 w 




Glu 


m -n 


J_ CI, OvISX. 








J- 4 / wl 




Lys 


i_.ys 










JL*±J\J 


Leu 


Pile 


Leu 


His He 








J- -J W J 


Val 


Asn 


Lys 


Tyir He 
























_L ^ 0 .J 


Cys 


Adtq" 


Se3r 


»-> J-. x^o±x 








1550 


His 




Gly 










1565 


X IJJL 


Ala 












_L J 0 \J 






AjrQ" 










1 RQS 
J — ' _? J 








XJ. J- 0 XJCiU. 








X 0 J_ VJ 






ir I/O 


Xjyfo *orJLU 








-L D ^ -J 




Phe 


X 1 i,J- 


vj-L IX J~j_y 0 








0 tt VJ 


Tin Y* 


Vj J- jf 




TArc! Pt~o 












Pine* 


XJCSU. 


XXXX. f^XCL 








J. 0 / \J 


X XJLJ. 




m 11 

V3JL U. 










1685 


Gly 




AjTCf 


His Ala 










VJ-L Li 


Lys 


P^o 










1 71 


m n 

V7X1J. 


0 C. J- 


His 


XJCIX xxxx 








1 7*^ n 


Piro 


lyx 


Aircf 


^yfc> Jrx 0 








1 7AR 


Asn 


He 


Arg 


Lys His 








1760 


Lys 


Met 


Tyr 


Asn Cys 








1775 


Val 


Glu 


Phe 


Arg Asn 








1790 









133 0 


JL JL6 


O tiX 


Tl fa 

X X C 


A ClT^ A GT^ 








X J rr-J 


irxxc 




OCX 


^^T" Tl e> 

wCX XX c 








X J 0 u 


Leu 


He 


Asp 


Gin Ser 








X 0 / 


_L J_ fc: 


OSX 


IjX u. 


XJciLl stS-kJ 








1390 


Lys 


Lys 


Ser 


niii fil V 








X *± U 


i—ys 


A a-r-\ 


Asp 


^ys vjxy 








x^z u 


Vo. JL 


XIX S 


Tl 0 
XX€ 


Al sa IVToi- 








X 'aO ^ 


k-.yfa 




Leu 










Xf± 3 U 


V7 JLXX 


XXX D 


1 

XJCLX 










X ^ D 0 


Va.x 


m n 

orXU 


krrXU 


xieu. JrxO 








X *± 0 w 




X XXX 


VjX LL 


XrX jrXXt: 








1 A Q R 


Lys 


JLy 


vjXll 










X -J X u 


Val 


vjX Li 


Asp 


Thr Glu 








X -J ^ -J 






Val 


v-»jf to XI _y 0 








1 RAn 

X J *3; W 


Ser 


Met 


Ala 


PT^f^ Trf^ll 








J_ «J J 


P3ro 


Jrxie 


Lys 


v^ys juys 








1 S7n 




AT « 


x*a.X y 


Atoxx rxx 0 








1 RRR 






nx 0 


"^ra T r'\7-c; 
Va.X k^Yo 








X Q U U 


Aon 


X XXX 


His 


XJCSLX XJCZLX 








X D X-J 


AlTQ" 


Lys 


Pino 


XXXX k^^to 








X D 0 U 


ixp 


AT 


Leu 


Aon A an 








X 0 ^3 


PVlfa 
irXXO 


xiy to 




X XXX X X 








X 0 0 u 


iJ CiX 


AT « 


X*XC L. 


xj_y to Ato f 








1 

JL\J / ^ 


PViea 
IrXXC? 


■Lieu 




jntoj^ XJ ts u. 








X D VJ 


Leu 


Tlir 


Lys 


H"] Atct 








1 7ns 

X / U J 






crX U 










X / ^ u 


jfix y 


nx 0 


r Arc* 


ax y V CL X 








1 7"^ R 
X / .5 D 


m 

Trp 


^^ys 


Asp 


X yx /ix y 








1 "7 R n 

X / D U 


He 


Leu 


His 


Thr Gly 








1765 


Pro 


Lys 


Cys 


Asp Tyr 








1780 


His 


Leu 


Lys 


Glu Gin 








1795 



1335 



Lys 


Gly 


Gin 


Ala Met 








1350 


He 


Arcf 


He 


TiVF! Acsn 








1365 


Glu 


Glu 


Gly 


Leu Tlf^ 








1380 


Leu 




Asp 


Cvs Ala 








1395 


Ser 


Ser 


He 


niv Rlu 








1410 


PViis 

XrXXC 


XJc: \JL 


AT « 


A ci'o (^T "vr 








1425 


Lys 


His 


Pro 










1440 


xjy to 


OCX 


XT XXC 


Xjf X XXXX 








1 ARR 

X ^ -J -J 


AT P\ 


fJT \T 


xxx to 


XMc I- r-iX y 








1470 


V7X IX 


nT ^7 
vjx y 


RT V 


x^xd xxxx 








1485 


A GT^ 


OtiX 


nT n 

V^X LX 


\>7XXX jratoXX 








1500 


Glu 


Leu 


Leu 


A "Y^rx CZl ^'\ 
ir\S- y A- \Jl 








1515 


Gin 


He 


Asn 


Ar^rr rJT 1 1 
x^x y w -1- LX 








1530 




V_#y to 


nT \7- 

wX Y 


XJjf to X%LC U. 








1545 


AT ^5 


rxx 0 


Tle> 
xxc 


Arcf Thr 








1560 


He 


Cys 


His 


p"hc> AT a 








1575 


Val 


Lys 


■"J- y 


XXXO xjc::lx 








159 0 


C^T -vr 


V CLX 


AT 


p"h ^ \Tf^ T 








1605 


Gly 


Lys 


Hi Q 
XXX to 


mv VaT 








1620 


xaX to 


_ 

Leu 


to 


A en A'K'rY 
i*i.to^ jrt.x y 








1635 


nx to 




Lys 


XJCLX XXXto 








1650 


IrX (J 


Tin 
X XXX 


r^\7" cs 

V^jf to 


XXXto ±^J. 








X 0 Q ^ 


XXX to 




xTiX y 


'T'ViT* TTi cs 

X XXX XXX to 








1680 


^ to 


nT \r 


JTXXc: 


AT R m -v 








1695 


Avrr 
■ciXy 


fST n 

VzrXXX 


W-i cs 

XXX to 


TTtk" (^T "xr 

xxxx yjJ.^ 








1710 


IrXXc: 


AT a 


OCX 


xxxx xxxx 








1 79R 


XIX to 


xxxx 


nT \7- 


f^T n T Arcs 
VjXLl XJ Jf to 








1740 


Ser 


Asn 


Cys 


Ala Glu 








1755 


Lys 


His 


Glu 


Gly Val 








1770 


Gly 


Thr 


Asn 


Val Pro 








1785 


His 


Pro 


Asp 


He Glu 



1800 
Asn 


Piro 


Asp 


i-iSu jrt.xa 










Tyr 


CrrlU 


Cys 


Arg Leu 








1 Q o n 


Asp 


Se3r 


Pro 


Jrne i nr 








1 Q *5 C 


Lys 


Glu 


Lys 


Pro Leu 








1 o c n 
loo U 


CjXU 


(jin 


Val 


(jin (j±n 








1 o ^ c: 
lo DO 


Phe 


Ala 


Leu 


Asp Pro 








1 o o n 


Gin 


Tmr 


Leu 


Ala Met 








1 o o c 

1895 




(jlU 


Asp 


Gly Gin 








1910 


Val 


Gly 


Ser 


Val Val 








1925 


Asp 


Gly 


Ala 


Thr Gin 








194U 


His 


Gly 


Met 


Asp Glu 








1 o c c 

19 Db 




val 


Tlir 


Lys Gin 








19 / U 


Ala 


Pro 


Pro 


Glu Ala 








^ Q O C 

19 OD 


Val 


Tnr 


Glu 


Leu Gly 








o r» r» 


Gin 


Gly 


Arg 


Pro Gly 








^Ulo 


Gin 


Glu 


Val 


Ser His 








O O O A 




Meu. 


Pile 


Fro 








2045 


Leu 




Gin 


Val Val 








2060 


Arg 


Ala 


Gin 


Val Ala 








2075 


Phe 


Ala 


Val 


Cys Asp 








2090 


Gly 


Val 


Thr 


Gin Val 








2105 


Val 


Ala 


Gly 


Glu Urly 








2120 


Glu 


Hxs 


Met 


Asp Leu 








2135 


lie 


Val 


Thr 


Glu Glu 








2150 


Gly 


Gly 


Phe 


Ser Glu 








21bb 


Pro 


Pro 


Giy 


Val Gin 








ZloU 


Leu 


CjlU 


Thr 


Ala Asp 








219b 


Leu 


Gly 


Thr 


Glu Ala 








2210 


Ser 


Val 


Val 


He Tyr 








2225 


He 


Gin 


Ser 


Gin Arg 








2240 



<210> 33 
<211> 256 



Tyr 


Leu 


XllS 


i\±a vcj±y 








1 O 1 A 

lolU 


Lys 


Gly 


Gin 


Gly Ala 








1 o o c 


Ala 


Ala 


TV 1 — « 

Ala 


Leu Ala 








1 Q yi A 
lc54U 


Arg 


Ser 


Ser 


Arg Arg 








1 o c c 

lobb 


Val 


He 


He 


Pne Gin 








1 O A 

lo / U 


Ser 


Val 


Glu 


Glu Thr 








■1 O O IT 

1885 


Ala 


Gly 


Gin 


Val Ala 








1 Q A A 
1900 


Val 


lie 


Ala 


Thr Ser 








1915 


Pro 


Gly 


Pro 


He Leu 








193 0 


Val 


Val 


Val 


Val Gly 








1 A y1 r 

1945 


Ser 


Leu 


Ser 


Pro Gly 








"1 A A 

19dU 


Glu 


lie 


Leu 


Asn Leu 








1 A T C 

19 /b 


Ser 


Ser 


Ala 


Leu Asp 








1 Q O A 

19 9 U 


Glu 


vai 


Glu 


Gly Arg 








O A A C 


Ala 


Lys 


Asp 


Val Leu 








o n o A 


Val 


TV "1 _ 

Ala 


Ala 


Asp Pro 








2 03 5 


Ala 


Gin 


Glu 


Ser Pro 








2 050 


His 


Pro 


Ser 


Ala Ala 








2 055 


Phe 


Lys 


Lys 


Met Val 








2080 


Thr 


Ala 


Ala 


Ala Gly 








2095 


Val 


Val 


Ser 


Glu Glu 








2110 


Ala 


Gin 


He 


He Met 








A 1 O IT 

212 b 


Val 


Glu 


Ser 


Asp Gly 








O 1. >1 A 

2140 


Leu 


Val 


Gin 


Ala Met 








2155 


Gly 


Thr 


Thr 


Hi s Tyr 








2170 


Asp 


Glu 


Pro 


Gly Leu 








A 1 O IT 

2185 


Ser 


Gin 


Glu 


Leu Leu 








2200 


Gly 


Ala 


Pro 


Ser Arg 








2215 


Thr 


Gin 


Glu 


Gly Ser 








2230 


Glu 


Ser 


Ser 


Glu Leu 








2245 



lie 


Val 


Ser 


Lys Ser 








1 o -I c 

lolb 


Tnr 


Jrne 


vaj- 


^-Lu inr 








1830 


Glu 


Glu 


Pro 


Leu Val 








lo4b 


Pro 


Ala 


Pro 


Pro Pro 








1 O <r A 

lobU 


Gly 


Tyr 


Asp 


Gly Glu 








lo /b 


Ala 


Ala 


TV -J _ 

Ala 


Thr Leu 








1 O A A 

1890 


Arg 


Val 


Val 


His He 








1905 


Gin 


Ser 


Gly 


Ala His 








192 0 


Pro 


Glu 


Gin 


Leu Ala 








1935 


Gly 


Ser 


Met 


Glu Gly 








195 0 


Gly 


•TV T «, 

Ala 


Val 


lie Gin 








19ob 


Ser 


Glu 


TV 1 _ 

Ala 


Gly Val 








1980 


Ala 


Leu 


Leu 


Cys Ala 








1 A A cr 

1995 


TV T 

Ala 


Gly 


Leu 


Glu Glu 








O A T A 
2 UlU 


He 


Gin 


Leu 


Pro Gly 








O AO C 

2 02b 


Glu 


TV T 

Ala 


Pro 


Glu He 








O A /I A 

2u4u 


Ala 


Ala 


Val 


Glu Val 








A A c cr 

2 Obb 


Met 


Ala 


Ser 


Gin Glu 








207 0 


Gin 


Gly 


Val 


Leu Gin 








2085 


Gin 


Leu 


Val 


Lys Asp 








O 1 A A 

2100 


Gly 


Ala 


Val 


His Met 








A 1 1 IT 

211b 


Gin 


Glu 


■TV T 

Ala 


Gin Gly 








213 0 


Glu 


He 


Ser 


Gin He 








2145 


Val 


Gin 


Glu 


Ser Ser 








2160 


He 


Leu 


Thr 


Glu Leu 








2175 


Tyr 


Ser 


His 


Tnr Val 








2190 


Gin 


Ala 


Gly 


Ala Thr 








2205 


Ala 


Glu 


Gin 


Leu Ala 








2220 


Ser 


Ala 


Ala 


Ala Ala 








2235 


Gin 


Glu 


Ala 
<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 2801633CD1 

<400> 33 





trlU 


Leu 


Leu 


inr 


Pxie 


Lys 


Asp Val 


TV n ^ 
Ala 


lie 


Glu 


Phe 


Ser 


Pro 


1 








rr 
D 










10 










15 




/^T 11 
VjlU 


Trp 


Lys 


Cys 


Leu 


Asp 


He 


Ser 


(jf±n 


tirin 


Asn 


Leu 


Tyr 


Arg 










20 










O IT 

25 










30 


Asp 


Val 


Met 


Leu 


Glu 


Asn 


Tyr 


Arg 


Asn 


Leu 


Val 


Ser 


Leu Gly Val 










35 










40 










45 


Thr 


He 


Ser 


Asn 


Pro 


Asp 


Leu 


Val 


Thr 


Ser 


Leu 


Glu 


Gin 


Arg 


Lys 










50 










55 










60 


Glu 


Pro 


Tyr 


Asn 


Leu 


Lys 


He 


His 


Glu 


Thr 


Ala 


Ala 


Arg 


Pro 


Pro 










65 










/ 0 










75 


Ala 


Val 


Cys 


Ser 


His 


Phe 


Thr 


Gin 


Asn 


Leu 


Trp 


Thr 


Val 


Gin 


Gly 










80 










85 










90 


xxe 


Glu 


Asp 


Ser 


Phe 


His 


Lys 


Leu 


He 


Pro 


Lys 


Gly His 


Glu 


Lys 




















100 










105 


Arg 


Gly 


His 


Glu 


Asn 


Leu 


Arg 


Lys 


Thr 


Cys 


Lys 


Ser 


He 


Asn 


Glu 










110 










115 










120 


Cys 


Lys 


vai 


(jrln 


Lys 


Gly Gly 


Tyr 


Asn 


Arg 


He 


Asn 


Gin 


Cys 


Leu 










125 










130 










135 


Leu 


Txir 


Tnr 


Gin 


Lys 


Lys 


Thr 


He 


Gin 


Ser 


Asn 


He 


Cys 


Val 


Lys 










140 










145 










150 


Val 


Phe 


His 


Lys 


Phe 


Ser 


Asn 


Ser 


Asn 


Lys 


Asp 


Lys 


He 


Arg 


Tyr 










155 










160 










165 


Thr 


Gly 


Asp 


Lys 


Thr 


Phe 


Lys 


Cys 


Lys 


Glu 


Cys 


Gly 


Lys 


Ser 


Phe 










170 










175 










180 


His 


Val 


Leu 


Ser 


Arg 


Leu 


Thr 


Gin 


His 


Lys 


Arg 


He 


His 


Thr 


Gly 










185 










190 










195 


Glu 


Asn 


Pro 


Tyr 


Thr 


Cys 


Glu 


Glu 


Cys 


Gly Lys 


Ala 


Phe 


Asn 


Trp 










200 










205 










210 


Ser 


Ser 


He 


Leu 


Thr 


Lys 


His 


Lys 


Arg 


He 


His 


Ala 


Arg 


Glu 


Lys 










215 










220 










225 


Phe 


Tyr 


Lys 


Cys 


Glu 


Glu 


Cys 


Gly Lys 


Gly 


Phe 


Thr 


Arg 


Ser 


Ser 










230 










235 










240 


His 


Leu 


Thr 


Lys 


His 


Lys 


Arg 


He 


His 


Thr 


Gly 


Glu 


Lys 


Leu 


Tyr 










245 










250 










255 



Thr 



<210> 34 
<211> 615 
<212> PRT 

<213> Homo sapiens 
<220> 

<221> misc_feature 

<223> Incyte ID No: 7493525CD1 

<400> 34 



Met Asp 


Asp 


Leu 


Lys 


Tyr 


Gly Val Tyr 


Pro 


Leu 


Lys 


Glu 


Ala 


Ser 


1 






5 






10 










15 


Gly Cys 


Pro 


Gly 


Ala 


Glu 


Arg Asn Leu 


Leu 


Val 


Tyr 


Ser 


Tyr 


Phe 








20 






25 










30 


Glu Lys 


Glu 


Thr 


Leu 


Thr 


Phe Arg Asp 


Val 


Ala 


He 


Glu 


Phe 


Ser 








35 






40 










45 


Leu Glu 


Glu 


Tzrp 


Glu 


Cys 


Leu Asn Pro 


Ala 


Gin 


Gin 


Asn 


Leu 


Tyr 








50 






55 










60 
Met Asn Val 
Gly Val Ala 
Glu Lys Glu 
Pro Pro Ala 
Gln Asp Ile 
Gly Lys Cys 
Ser Val Asp 
Asn Gln Cys 
Lys Tyr Val 
Lys Thr Arg 
Asp Glu Ser 
Ile His Ile 
Ala Phe Lys 
Thr Gly Glu 
Lys His Ser 
Glu Lys Pro 
Ser Ser His 
Pro Phe Lys 
Ala Leu Thr 
Lys Cys Glu 
Thr Lys His 
Glu Gln Cys 
His Arg Arg 
Cys Gly Lys 
Met Ile His 
Lys Ala Phe 
His Thr Gly 
Phe Asn Gln 
Gly Glu Lys 
Arg Ser Ser 
Lys Pro Tyr 
Ser Asn Leu 



Met 


Leu 


Glu 




D D 




vajL 


Ser 


Lys 




oU 




Pro 


Trp 


Asn 








jyieu 


Cys 


Ser 




llU 




Lys 


Asp 


Ser 




IOC 

Iz b 




ij±U 


JdlS 


UlU 




140 




Glu 


Tyr 


Lys 




-1 r- [- 

lb D 




Leu 


Thr 


Thr 




170 




Lys 


Val 


Phe 




185 




His 


Thr 


Gly 




200 




Phe 


Cys 


Met 




215 




Arg 


Glu 


Asn 




o o r\ 




Trp 


Fne 


Ser 




245 




Lys 


Pro 


Phe 




2 60 




Ser 


Thr 


Leu 




'"i '~j r~ 

A /b 




Tyr 


Arg 


Cys 




o o r\ 




Leu 


Thr 


Thr 




3 05 




Cys 


Glu 


1 1 

CjIU 




320 




Thr 


TT A 


Lys 




335 




Glu 


Cys 


Asp 




350 




Lys 


He 


He 




3 65 




Gly 


Lys 


Gly 




3 80 




He 


His 


Thr 




395 




Ala 


Phe 


Asn 




41(J 




Thr 


Gly 


Glu 




4zb 




Asn 


His 


Ser 




A A f\ 

440 




Glu 


Lys 


Pro 




455 




Ser 


Ser 


Asn 




470 




Leu 


Tyr 


Lys 




485 




Asn 


Leu 


Thr 




500 




Lys 


Cys 


Glu 




515 




Thr 


Lys 


His 



Asn 


Tyr 


Lys 


Gin 


Asp 


Pro 


Met 


Lys 


Arg 


Tyr 


Phe 


Thr 


Phe 


Gin 


Gin 


Asn 


Leu 


Gin 


Val 


His 


Lys 


Thr 


Gin 


Ser 


His 


Lys 


Phe 


Glu 


Lys 


Pro 


Leu 


Leu 


His 


Ser 


Tyr 


Gin 


Thr 


Leu 


Thr 


Lys 


Cys 


Glu 


Thr 


Thr 


His 


Glu 


Glu 


Cys 


His 


Lys 


Val 


Cys 


Gly 


Lys 


Phe 


He 


His 


Lys 


Ala 


Phe 


His 


Ser 


Gly 


Phe 


Asn 


Trp 


Gly 


Glu 


Lys 


Val 


Ser 


Ser 


Lys 


Pro 


Tyr 


Ser 


Lys 


Leu 


Tyr 


Lys 


Cys 


Leu 


Thr 


Lys 


Cys 


Glu 


Glu 


Thr 


His 


Lys 


Glu 


Cys 


Gly 


Asn 


He 


He 



Asn 


Leu 


V CLX 


/ u 






vai 


Tnr 


Cys 


oo 






Wn cs 
Xx-L O 






1 no 






T Arc 






lie; 
lib 






V ci J- 


J- _L t; 


T>oi 1 
J-Jti U. 


1 J u 






Leu 


Arg 


Lys 


1 A K 






vtJ.U 


Pi 
vjxy 


±yjL 


lb u 






Lys 


lie 


Jrne 


1 /b 






Leu 


Asn 


7\ 1 ^ 

Ala 


ly U 






Jriie 


Lys 


^ys 


one 






Leu 


Ser 




o o n 






Cys 


VjlU 


olU 


O "3 c; 
Add 






Arg 


XlX D 


Lys 


o c: n 
zb U 






CjlU 


Cys 


Qjiy 


OCR 

z D b 






Lys 


jMet- 


lie 


o Q n 
Z o u 






vjiy 


T 

Lys 


AX a 


O Q c: 

z y b 






i±e 


m s 


Tnr 


b 1 U 






Aia 


jrne 


Asn 


"3 O R 






Val 


Lys 


1 1 

(jIU 


"3 yi n 






Asn 


Arg 


irne 


"3 C n 

b bb 






(jlU 


Lys 


Ser 


Q ""z n 
b / u 






Ser 


Ser 


Tnr 


bob 






Pro 


Tyr 


Lys 


4UU 






riis 


Leu 


Tnr 








Lys 




\jX u. 


yi n 
^tb u 






Thr 




jnjL o 


yi / R 
44b 






ijlU 


VjIU 


Cys 


4bU 






His 


Lys 


lie 


4 / b 






K-y to 




T 

J-ijf o 


490 






Arg 


He 


His 


505 






Lys 


Ala 


Phe 


520 






His 


Thr 


Gly 





Leu 


Ala 






/ D 


Leu 










y u 


V ax 


Asp 


VjXU 






1 OR 


m 

Trp 


Pro 


VjXU 






1 9 n 
xz u 


Arg 


Arg 


Tyr 






Xo D 


vjiy 


Ser 


Ala 

AX a 






1 

XD U 


Asn 




Leu 






1 R 
X Q b 


Pro 


Cys 


Asp 






ion 
lo U 


Asn 


Arg 


xixs 






1 Q c; 
xy b 


Lys 


Lys 


Cys 






o 1 n 

z 1 U 


His 


Lys 


Arg 






o o c 

zzb 


Cys 


taiy 


Lys 






Z4feU 


Arg 


Tl o 

lie 


riXS 






ore: 
z b b 


Lys 


AT a 

jAxa 








on r\ 
z / u 


iIjLS 


Thr 


taiy 






O P R 
Z OO 


TSVi » 
irlie 


lyr 


nxo 






•5 n n 

o u u 


v=riy 


VjtXU 


Lys 






bib 


XlXo 


Pro 








n 

b b u 


Lys 


Pro 


Tyr 






b4b 


Ser 


Tyr 


Leu 






b D U 


Tyr 


Lys 


Cys 






O T c 

b /b 


Leu 


Thr 


Lys 






o Q rk 

by u 


Cys 


1 1 

LjtIU 


(jrlU 






4LUb 


Tnr 


±11 S 


Lys 






/on 

4I:Z U 


CjlU 


Cys 


Vjiy 






/I "3 R 

4tb b 


Lys 


Tl tr^ 

lie 


Tl ITS. 

lie 






A c r\ 
4bU 


Gly 


Lys 


TV 1 -a 

Aia 






/ICR 

4ob 


He 


His 


Thr 






/ion 
4 o U 


■Ala 

Aia 




Asn 






495 


Thr 


Gly 


Glu 






510 


Asn 


Arg 


Ser 






525 


Glu 


Lys 


Ser 














530 






535 


RAO 


Tvir 


Lys 


Cys 


Glu 


Glu 


Cys 


Gly Lys Ala Phe Asn Gln Ser Ser 


J. J.1JL 










545 






550 


^ ZJ ~J 


Leu 


Thr 


Lys 


His 


Arg 


Lys 


Ile 


Gln Gln Gly Met Val Ala His 


Ala 










560 






565 


R7n 
^ / \j 


Cys 


Asn 


Pro 


Asn 


Thr 


Leu 


Arg 


Gly Leu Gly Glu Gln Ile Ala 


Arg 










575 






580 


585 


Ser 


Gly Val 


Gin Asp 


Gin 


Pro 


Gly Gln His Gly Lys Thr Pro 


Ser 










590 






595 


600 


Leu 


Leu 


Lys 


lie 


Gin 


Lys 


Phe 


Ala Gly Cys Gly Gly Arg Arg 


Leu 










605 






610 


615 



<210> 35 

<211> 418 

<212> PRT 

<213> Homo sapiens 

<220> 

<221> misc_feature 

<223> Incyte ID No: 7021892CD1 

<400> 35 



Met 


Leu 


Leu 


V dJL 


A J- d 


i. IIJL 


JL _L 


Leu 


Asp 


Ser 


Pro 


Gly 


Leu 


Asp 


Pro 


1 








5 










10 












Phe 


j._y J- 


His 


Phe 


Asn 


O G J- 




Ser 


Asn 


Asn 


Glu 


Arg 


Tyr 


Ser 


Ser 










20 










25 












Phe 


Gin 


Leu 


Ala 






J — L ti 


Gin 


Lys 


Ser 


Ala 


Ala 


Met 


Thr 


Leu 










35 










40 












His 


Val 


Cys 


Thr 








Trp 


Tyr 


Lys 


Gly Tyr His 


He 


vaj_ 










50 










55 










fin 


Glv 


Lys 


Asn 


Leu 






OCX. 


Asn 


Asn 


Leu 


Asn Asp 


Gly Arg 


jxiet 










65 










70 










75 


Lys 


Ser 


Glu 


Ser 


Asp 


Trp 


He 


Lys 


Lys 


Glu 


Gly Lys 


Gly Val 


Ala 










80 










85 










90 


Lys 


Val 


Gly 


Gly 


Asp 


Thr 


Leu 


Trp 


Tyr 


Lys 


Ser 


Pro 


Trp 


Gin 


Ala 










95 










100 










105 


Ala 


Leu 


Thr 


Pro 


Asp 


Leu 


Ser 


Cys 


Pro 


Gin 


Lys 


Gin 


Leu 


Glu 


Ala 










110 










115 










120 


Arg 


Gly 


Glu 


Thr 


Pro 


Glu 


Gly 


Glu 


Thr 


Phe 


Ala 


Met 


Ala 


Glu 


His 










125 










130 










135 


Phe 


Lys 


Gin 


He 


He 


Arg 


Cys 


Pro 


Val 


Cys 


Leu 


Lys 


Asp 


Leu 


Glu 










140 










145 










150 


Glu 


Ala 


Val 


Gin 


Leu 


Lys 


Cys 


Gly Tyr Ala 


Cys 


Cys 


Leu 


Gin 


Cys 










155 










160 










165 


Leu 


Asn 


Ser 


Leu 


Gin 


Lys 


Glu 


Pro 


Asp 


Gly Glu 


Gly Leu Leu 


Cys 










170 










175 










180 


Arg 


Phe 


Cys 


Ser 


Val 


Val 


Ser 


Gin 


Lys 


Asp 


Asp 


He 


Lys 


Pro 


Lys 










185 










190 










195 


Tyr 


Lys 


Leu 


Arg 


Ala 


Leu 


Val 


Ser 


He 


He 


Lys 


Glu 


Leu 


Glu 


Pro 










200 










205 










210 


Lys 


Leu 


Lys 


Ser 


Val 


Leu 


Thr 


Met 


Asn 


Pro 


Arg Met 


Arg 


Lys 


Phe 










215 










220 










225 


Gin 


Val 


Asp 


Met 


Thr 


Phe 


Asp 


Val 


Asp 


Thr 


Ala 


Asn 


Asn 


Tyr 


Leu 










230 










235 










240 


He 


He 


Ser 


Glu 


Asp 


Leu 


Arg 


Ser 


Phe 


Arg 


Ser 


Gly Asp 


Leu 


Ser 










245 










250 










255 


Gin 


Asn 


Arg 


Lys 


Glu 


Gin 


Ala 


Glu 


Arg 


Phe 


Asp 


Thr 


Ala 


Leu 


Cys 










260 










265 










270 


Val 


Leu 


Gly 


Thr 


Pro 


Arg 


Phe 


Thr 


Ser Gly Arg His 


Tyr 


Trp 


Glu 










275 










280 










285 


Val 


Asp 


Val 


Gly 


Thr 


Ser 


Gin 


Val 


Trp Asp Val 


Gly Val Cys 


Lys 



290 295 300 
Glu 


Ser 


V Six 


Asn 


Arg 
"5 n R 






Gly 


Phe 


Leu 


Thr 


Val 

o n 
o ^ u 


Gly 


Cys 


Ser 


Thr 


Val 


Pro 


Met 

•5 tr 


Thr 


Pro 


Arg Val 


Gly 


He 


Phe 


Leu 


Asp 










c; n 
0 D U 






Tyr 


Asn 


Val 


Ser 


Asp 

0 d 

ODD 


Gly 


Cys 


Pro 


Val 


Cys 


Glu 


Pro 
380 


Trp 


Arg 


Ser 


Gin 


Asp 


Asp 


Gin 

395 


Ser 


He 


Pro 


Ser 


Ala 


Ala 


Ser 
410 


Ala 


Pro 



Lys 


He 


Val 


Leu 






\ZJ-L U. 








310 










1 


Arg 


Glu 


Gly 


Lys 


V d J. 


xrXics 




Z-i— L Ct 






325 












Leu 


Trp 


Val 










WH cs 

Xj. J- 0 






340 










0 '± 


vaj. 


Gly Met 


Arg 


otsr 


_L j_ t; 










355 










0 0 u 




He 


Tyr 


Thr 


Phe 


He 


Glu 


He 






370 










375 


Pro 


Phe 


Phe 


Ala 


His 


Lys 


Arg 


Gly 






385 










390 


Leu 


Ser 


He 


Cys 


Ser 


Val 


He 


Asn 






400 










405 


Val 


Ser 


Ser 


Glu 


Gly 


i^ys 







415 



<210> 36 

<211> 1010 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 7492673CB1 



<400> 36 

atggcggatg acgccagtgc agagggcggg ggcgggcggg cgcgggcggg cgggcgggtg 60 
gtgtgggcgc cggaggcccc gggggccctg ggaaggggaa ctgatgtgac ttccgctgag 120 
ggtttcagca gtggcatccg ggtccggggt cacggccgtg gatggggccg gggccggggt 180 
cgcggccgtg gatggggccg gggccaagac cgcggagttc gcggaggcaa ggcccaggat 240 
aaggagcgga tgcccgtcac caagctggac ggcctggcca aggacatgaa gatcaagtcc 300 
ctagaggaga tttatctctt ctcgctgccc atcaaggaat ctgagatcat tgactttttc 360 
ctgggggcct ctctcaagga cgaggttttg aagattatgc ctgtgcggaa ggagacccgc 420 
gccggccagc gcaccaggtt caaggcgttt gttgccatca gggactacaa tggctacgct 480 
ggtgggttga agtgctccaa ggaggtggcc gccgccatcc gcggggccat catcctgacc 540 
aagctctcca ttgtcaccgt gcgcagaggc tactggggga acaagatcgg caagctccac 600 
accgtccctt gcaaggtgac aggccgctgc ggctctgtgc tggtgccctt catccccgcg 660 
cccaggggca ctggcatcgt ctcagctcct gtgcccaaga agctgctcat gatagctggt 720 
atctacgact gctacacctc agccaggggc tgcactgcca ccctgggcaa cttcgccaac 780 
gccaccttgg atgctgtctc taacatctac agctacctgc ccctgaagga cttctggaag 840 
gagactgtat tcaccaagtc tccctatcag gaattcactg accacctcgc caagacccac 900 
accagagtct ccgtgcagag gacccaggct ccagctgtga ctacaacata gggtttttat 960 
gcaagaaaaa taaagtgaat taaagcctat tactgtcaaa aaaaaaaaaa 1010 



<210> 37 

<211> 612 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 7990930CB1 



<400> 37 

taagaagtga tgggagctgc atctgctgca agaatgaagg ctgttctcag taatcagact 60 
gtccacattc cagaaaatgt cgacatcact ctgaaggggc acacagttct tgtgaaaggc 120 
cctagaggaa ccctgcagag ggacttcagt cacgtcagta tagaactcgg gctccttgga 180 
aagaaaaagc agaggctcca gattgacaaa tgcatgctga gaagatggga attggctgcc 240 
atgcgtgcta tctgtagtca tgcacagaag atgatcaagg gcgttatact gggcttctgt 300 
gacaagatga ggtctctgtg ttctcacttc cccatcaatg tcattatgca ggagaatagc 360 
tctattgtag aagtcagaac tttcttgagc gaaaaatata tctgcagggt tcgggtgagc 420 
ccaggtgttg cttgttcaat atcccaagcc cagaaagatg agttcatcct taaaggaaat 480 
gacattgaac ttgtttcaaa ttcagctgct ttgactcagc aaaccacaac agctaaaaac 540 
caggatatca gaacattttt ggatggtatc tatgtctctg aaaaagggac agttcagcag 600 
gctgatgagt aa 612 



<210> 38 

<211> 2663 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7037554CB1 



<400> 38 

ctgagcccta 

aaagggaaaa 

gtatccgggc 

gcgagtcgcg 

cggctgacag 

aagtaccaga 

aaaagggatc 

cttctgtcca 

aaagaatagt 

gatctgaaag 

ccagagaacc 

ggagggccaa 
gtgcaagcag 
accatgagac 
aggaaggagt 
atgaagaggt 
aggaggagga 
atgattatga 
atgggtctgt 
agagagctag 
catatgcagg 
aagatcaaac 
gtaacaacca 
taaatgagaa 
ctgtcagaga 
gaggatctcc 
tctttaaaat 
ccaatccttg 
ttgaatgtgg 
aggtcattca 
gtccatcccg 
ttcataacag 
ggtatttaaa 
gacgattttc 
gggaatttca 
tggaacaacc 
ccccttactc 
atgattatga 
ggagaagtag 
gacgagacag 
atcgagacag 
ctgattgttt 
aaatttattt 
cctttgttcc 
acagctgaca 



acgagaaggc 
agcgggaaga 
agacggactg 
tctaagcggc 
tcgggaggag 
acaagatgat 
aaaaagaaaa 
ttcaagacaa 
tagtacaaaa 
aaacaagcgt 
ttataagaat 
atctcctacg 
atccagccag 
tggcagcagt 
ggaagaagat 
ggatgaagat 
ggaagaagaa 
cactcgaagt 
cagatctggt 
aggcatatct 
ttcagaaaag 
cagtaaactc 
tgagaatgtg 
gaaattaaat 
gagtggaaaa 
tatacactgg 
tgactggatt 
gaatgaacat 
aacccagctt 
taaaatgcgt 
tcgagaacca 
cagaaagaaa 
ggatccacga 
aggagttcgc 
taacatggga 
tccacaccat 
aggacatcat 
tatgagggtg 
accccgtgaa 
agagcgagat 
agaccgaggg 
aaagatacaa 
tcagctgtct 
aagcatgcag 
gttgaattga 



tgggcctagg 
gtcgagaagg 
acggacgggc 
ggcggcggtg 
aaagatggag 
gaactgtata 
agtgatcgaa 
ctggtttcta 
ggaaagtcag 
ctagatgctg 
caacctgaaa 
ccagatggtt 
tcttctaagg 
ggttcttctg 
gtggaggaag 
ggagaggagg 
gaatatgaac 
gaggccagtg 
tcaggcacag 
ccaattgttt 
aagcatgaga 
aaatatgtgc 
tctcttgcca 
cttgcattta 
tttcaagggt 
gtgcttccag 
tgcaggcgtg 
aaaccagtaa 
tgtcttctgt 
cacaagagaa 
gtccgggatg 
ccaaggattg 
taccaggaag 
cgagatgtgt 
ccaccaccac 
ccttactatc 
ccagtaccac 
gatgatttcc 
agagaccggg 
agaggacgtg 
gagagaggtc 
aaaatcttgt 
gcctatgaag 
tatcataaga 
cat 



ccgctggatg 
ctgagtgtta 
ccgtgcttct 
gcagcggcgg 
aacttaatgt 
atccagagag 
tggaatctac 
agccactgag 
ccacagagta 
atcggaaaat 
aaacctgtgt 
ctgagagaat 
aagaagtgaa 
atgagcaagg 
atgaagaagt 
aggaggaaga 
aggatgagag 
actctggttc 
atggatcaga 
ttgatagaag 
aattatcatc 
ttcaagatgc 
aagcgaaggg 
gatctgcaag 
ttgcaagact 
caggaatgag 
aattaccctt 
agatcggacg 
ttccccccga 
gaatgcattc 
tgggaaggcg 
actatccccc 
tggacagttt 
ttttaaatgg 
cttggcaagg 
agcaccatgc 
atgaagcaag 
ttcgtcgcac 
aacgagagcg 
atagagaaag 
gatatagaag 
attttttttt 
ttcattgtgt 
actggaaaaa 



ctggagtgaa 
agaggccaag 
gccgcggctg 
aaaccgaagg 
tctggatgat 
tgaacaagat 
tgataccaaa 
ctcatctgtt 
taaaaatgag 
tcgtctatca 
ccggaaaagg 
tgggcttgaa 
ctctgaagaa 
gaacaacact 
agaagaagat 
ggaggaggag 
agaccagaaa 
tgaatctgtt 
tgagaaaaag 
tggaagctct 
ttccgttcgt 
aagatttttc 
tgtatggtcc 
gagtgttatc 
ttcttcagaa 
tgctaaaatg 
cactaagtcg 
tgatggacag 
tgaaagtatt 
tcagccccga 
tcgaccagaa 
tgagtttcac 
cacaaatctt 
gtcctacaat 
aatgccccct 
tccacctcct 
atacagagat 
acaagctgtt 
agaccgccct 
agaaagagag 
ataatgggct 
ggtgtgtgtt 
agaaggattt 
ctcaaatccg 



aggaagggag 
tgcgacgcgc 
cggcgcccgc 
ggagccatgg 
attttaactg 
aaaaatgaga 
cgacaaaagc 
agcaataaca 
gaatatcaaa 
agtagtgcct 
gatcctgaaa 
gtggatagac 
tatggctctg 
gagaatgagg 
gcagaagaag 
gaagaggagg 
gaggagggaa 
tccttcacag 
aaggaaagga 
gcatcagagt 
gctgtccgaa 
ctcataaaga 
acgctccctg 
ttaatatttt 
tcacatcacg 
ctgggaggtg 
gctcatctca 
gaaattgaac 
gacttgtatc 
tcacgaggac 
gattatgata 
cagagaccag 
attcccaaca 
gattatgtga 
tacccaggaa 
caagctcatc 
aaacgagtac 
gtcagtggcc 
agagataaca 
cgcttatgtg 
tttggaagca 
tacaagtagt 
attatgaccc 
ccagaaatcc 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2663 



<210> 39 
<211> 7188 
<212> DNA 
<213> Homo sapiens



<220> 

<221> misc_feature
<223> Incyte ID No: 1515347CB1



<400> 39 

tagcgccttt 

gcttccggtc 

gcttctccca 

tggatcagat 

gacttgctaa 

gatgctcgtc 

ccaagtgagc 

agggtggcct 

ccgccacccc 

gcgccgtact 

ctgaggctgg 

ttgaaatctg 

ttagagatgt 

agtgagcaac 

attctctcca 

ttttatgaca 

atcgggagat 

aaattgttga 

tactccatgg 

atggatgatg 

tctgtcacgg 

attgagtatc 

actgatgctc 

gaggagctag 

ctggaattat 

atgactgcag 

cggctgcggc 

agcatggagt 

ccacccaccc 

tatgaagcca 

aagcgacaca 

gcggtcgtcc 

agagagggca 

cccctgccaa 

atcagtgagg 

ctcacaatcg 

tcctgtagcc 

attccacgag 

gcccaggatg 

atgactgctg 

aagaacccca 

cctcccatcc 

gctgatcagc 

cccccaccac 

cagccgccag 

ccacagcctg 

gccgtactgg 

gtgagtggaa 

atcaacaagc 

gctcccgccc 

gcgacccctg 

acagcctcgg 

gtgccccaag 

catttccagc 

cagcagcagc 

tctcaggtgc 



ctggccttgg 
cgctgctgtg 
ttggagggcc 
ttatttagtc 
ggatttgtgc 
gtgggaagga 
tgatgttgac 
ttgtgattcc 
tgtacagcca 
tccagcagct 
tgcagttcga 
aaggacgtcg 
tcttgaactt 
ggcaggaact 
ctcacagccg 
atgacctgaa 
gcaaagacat 
aaaatggaac 
ctttcttaac 
ctggcttccc 
aaaccattgc 
tggaggagga 
tgtcatcaga 
ctgacttcat 
tccatacttc 
tgagggcatg 
t^gagcagga 
atgtctacga 
cgccgcagga 
ctcccatccc 
aaacagaccc 
ctcctcggtc 
a-9"gagcagaa 
cttttgccaa 
actgggcgct 
tgtcacctgc 
gaatctaccg 
aggaggggaa 
agaatgccac 
gcaagaggag 
agcacgcgtc 
aggtggcatc 
agaaggcaca 
cgccgcagca
cagggccacc 
tgcaggcccc 
caggaaccat 
atgtgatcgt 
gcctggcgtc 
aggtggtgca 
acctggtgtc 
ccgtggtcac 
tgtcccaagc 
ttctcaggca 
agcagcagca 
aagttccaca 



atcacccggc 
aatgcctcgc 
gacccaagga 
taacgagcgg 
cctgcctagc 
ggccgggcca 
gctttgtcgg 
tccggtggtg 
cagaatgagg 
gcggcagacc 
ctcagggaag 
ggtgctgatt 
ccattacctc 
gatgaggagt 
taccacaggt 
tccagtgatg 
ccacatatac 
taaagatctg 
tcagcgaacc 
ggtcaaagct 
acccaaaatt 
tgcccagaag 
ctctgagaac 
ggagcagctt 
tattgagcaa 
ggagttctgg 
ggaggcggag 
agatgtcgat 
cgacagcgac 
agaggctaag 
ctcagctgca 
cctgtttgac 
gaagaatatt 
acccacagct 
gctgcaggct 
tcacacacct 
ctcttccaaa 
gagtaaaaac 
acacacccag 
tcccccaatc 
tgtgttggca 
tctccgtgca 
gcagccggcc 
gccaccgcca 
agctgtccag 
agcgaaggcg 
taaaacatca 
gaacaccatc 
gccagtggct 
cacccagccg
catggcaacg 
taccaacctg 
cacaggagtt 
gcagcagcag 
gcagcagcaa 
gatccagggc 



gcgtcccaac 
tttaggagaa 
ggaaaagacc 
cgctgttctc 
catggaaggg 
gcgcacagtt 
tgtggagagt 
gcagcacccc 
atcttgaggc 
acggctccac 
ttggaagctt 
ttatcacaga 
acctatgtaa 
ttcaacagag 
ataaaccttg 
gatgccaaag 
aggcttgtga 
atccgagaag 
atccaggagc 
gaggagtttg 
gcaagacctt 
tccgcacagg 
atgccgtgtg 
acaccaattg 
gaaaaggaga 
aacctgaaga 
ctcctgacct 
gggcagacag 
atctacctcg 
ctgccccctg 
ggcaggaaga 
cgcgcaacac 
ctgctgaagc 
gagcctggtc 
gtaaagcagt 
aattgggatc 
cagtgccgga 
aaccgtcctc 
ctgtacacga 
aaacctctgc 
gaaagtggaa 
gagcgaatcg 
gtggcccagc 
ccgctgccac 
ccccaacccc 
cagcgcgcaa
gttactggga 
gcaggggtcc 
cctggggcct 
ccgccacggg 
actcagggtg 
accccagtgc 
cagctccctg 
cagcagcaac 
cagcagcagc 
caggcccagt 



cgcaccttga 
ccctgacggt 
agactgcttg 
aagctccagt 
tacagtggcg 
acacttcatc 
ctctgcagga 
cgtccctacg 
agggcctgag 
gcctgctgca 
tagctatctt 
tgattcttat 
gaatcgatga 
acaggcggat 
tagaggcgga 
ctcaggagtg 
gtggcaattc 
tggctgctca 
tgtttgaagt 
tggtgctttc 
tcatagaggc 
agggggtgct 
atgaagaacc 
aaaaatatgc 
gaaacagtga 
ccctgcagga
acacgcgaga 
aagtcatgcc 
actcggtcat 
tgtacgtgag 
agaagcagcg 
caggacttct 
agcaggtgcc 
aagacaaccc 
tactggagct 
ttgtcagtga 
atcgctacga 
tccgtacgag 
gccactttga 
ttggcatgaa 
tcaactatga 
caaaagagaa 
cacccccgcc 
aaccacaggc 
agccacagcc 
tcacgacggg 
cgagcatgcc 
cagctgccac 
tgactacgcc 
cagtcggctc 
ttcgagcggt 
agaccccggc 
gaaaaaccat 
aacagcagca 
agcaacagac 
ccccagcaca 



cccacaagtg 
ctccaaacca 
aaagagcgcc 
ctatggcaga 
tgggtccctg

ctcagaaagt 
tgttattgac 
ggtgccgcgg 
agagcacgct 
gttccctgag 
gcttcagaaa 
gttggacatt 
aaatgccagc 
tttttgtgcc 
caccgtcgtg 
gtgcgatagg 
cattgaagag 
gggaaatgac 
ttattctccc 
tcaggaacct 
cctcaagagt 
gggaccacac 
atcccaatta 
tttaaattac 
ggacgcagtg 
gagggaggcc 
ggatgcctac 
gctctggacc 
gtgtctcatg 
gaaggagcgg 
tcacggggag 
gaaaattcgc 
attcgccaag 
cgagtggctc 
gcctttgaac 
cgttgttaac 
gaatgtcatc 
ccagatctat 
cttaatgaaa
tccctttcag 
caagccgctg 
aaaggctctg 
ccagccgcag 
agcgggcagc 
ccagacccag 
gggcagtgca 
cactggtgcc 
cttccagtcc 
gggaggctct 
cccagccacg 
cacttctgtg 
acggtctttg 
cacacctgca 
gcagcagcag 
gacgacgacc 
gatcaaagct 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 

2220 

2280 

2340 

2400 

2460 

2520 

2580 

2640 

2700 

2760 

2820 

2880 

2940 

3000 

3060 

3120 

3180 

3240 

3300 

3360 
gtgggcaagc tgacgccgga acacctcatc aaaatgcaga agcagaaact gcagatgccc 3420

ccgcagcccc caccgccaca ggcccagtct gcgcccccgc agccaacagc ccaagtgcaa 3480

gtgcagacct cgcagccgcc gcagcagcag agcccccagc tcacgacggt cacggcccca 3540

aggcctggtg ccctgctgac gggcaccacc gtggccaacc tccaggtggc ccggctcctg 3600

caggcgcaag ggcagatgca gacccaggca ccccagccag cccaggtggc cttggcgaag 3660

cctccggtgg tgtccgtccc ggcagctgtg gtctcctcac cgggagtcac caccctgccc 3720

atgaacgtcg cggggatcag cgtggcgatc ggtcagccac agaaggcagc aggacagacc 3780

gtggtggccc aggcccgtca catgcagcag ctgctgaagc tgaagcagca ggccgtccag 3840

cagcagaagg ccatccagcc ccaggctgca cagggcccgg cagccgtcca gcagaagatc 3900

accgcacagc agatcaccac ccctggcgcg cagcagaagg ttgcctacgc cgcgcagccg 3960

gcccttaaga cccagtttct taccacaccc atctcccagg cccagaaact ggccggggcc 4020 

cagcaagtgc agacccagat ccaggttgca aaacttcctc aagttgttca acagcaaaca 4080 

cccgtggcca gcatccagca agttgcctct gcttcccagc aggcttctcc acagactgtg 4140 

gcgctcacgc aggcgacggc ggccgggcag caggtgcaga tgatccctgc agtgaccgcg 4200

actgcccagg tggttcagca gaaactcatt cagcagcagg tggtgaccac ggcgtcggcc 4260

ccgctccaga ctccaggcgc tcccaaccca gcccaggtgc ccgccagctc cgacagccca 4320

agccagcagc ccaagttaca gatgagggtc cctgctgtca ggctaaagac acctactaag 4380

cctccgtgcc agtagtcagg gcagcagggc tgcctctcat ctaaagcaaa actaccttcc 4440 

tcacagaaaa cgctttatta gtgaaccttg ggaccatgtc acgcaagaga ttcagcactg 4500 

ggaaagatat aattgaaaca aaatagtgta atcattttat taaaatgcat cccacactgc 4560 

aggacaaatg gtccttatgg agtgccgcgt tctctgtact acgtggctca tggaaaaagt 4620 

gacaacatgg cttcctctaa atcatttcac ctttcagtcc ccacccgcac ccgtccccta 4680 

gagccatagt actgtgttct gaaagccatt tagaatttct ttgtgagcat gtagtgcttt 4740

gcacgccaca gaagccgtct gccgtgtgtg aggagcatac aatggacttt ctaaagataa 4800 

ggcgtgggct tccacagtgt ctgccagagt ttagttcttt ataccttact gaaaaatgcc 4860

tcgtggtctt cgcagagggg aaggcctgtc taaagtcaat catccgagat gggttttcca 4920

ttccaaagaa aggcaatatg gttccttcct tccctcctaa aatatgactt aacttttaag 4980 

agaaatgttc tgacacccac ctaaacacac aaggcacgtt cctggcctgt gttcaaggga 5040 

aatgatcagt cattgcattg ttattccaaa gagcagccaa cagtggcctc ccccaggccc 5100 

taccctgcaa tgggattcgc tttcatttaa tggaaacttc tgggactgat gcccaactca 5160 

gtgcactcaa gacgcatctc cagttttcgg gggaagctgg tatttgacat agtgtgttaa 5220 

acagctcctg agaacctttg ggacactctg ccatggctgg cgtgaggccc agaggaccac 5280

gcagaggcaa tggtagtaca gatgtcacag ctgagggtac gatgaggcct gggctcagtg 5340

agccaggacg aatgtgacag acaccccttg ctgccacagt cagccctttg acgaaggtgg 5400 

gctggtgatt ctggaagtat tggctatagc ggtgggccca gtcaactctt ccttgtggac 5460 

ttacgacagc agattttctc taggataagc ttgtgtggtt ctgccagtga agcagagaac 5520 

cacctgtgct gttgtggaag gcgtgccgtt gagggggaaa acgaagccca gtatttgcta 5580 

ctgtttttcc tttttttact atgacaggaa aataaatgca attttagtgg aattgattga 5640

cagtgtctcc ttactttgaa gttttcacca aagcaaaaag gtccatatcc aatagtatcc 5700 

tttgtgctgt ggcttgattt tggcctattt tacattattt ggtccaggaa attaggttat 5760 

attaggtttt ttgtatacta aaaatcagtt atggcacaat aaagattttc tgtttttaaa 5820

ttgtatttca tctgcttcct ccccattctc tcactttaag tgacattgag gaaggtattc 5880

tgtcccacag gtttctgtgg acagcgatac agcaggagtc agtgaaatca actggggagc 5940

tcacttgagc tcttgataag aaatgtggag aaaagtaaaa accaagcttt gaagaaacag 6000 

aagaaattaa tcttttagtt agttgaacat accaaagcag aggactggaa tctgtttgtt 6060 

ctaaccaacc cgttctccct ggcttggcac gtgccgtgag agcgcagctt gccggaggga 6120 

gggccgctgt gtgcgcctca catctggctc ccagtggaaa cttttactcc tcctcatccg 6180 

cagatgtgat agaactgaag tatctaggaa ttctgccttt gtcatttgtt ttaatttgtg 6240 

tgccctgttc attttttttg tctttcccaa atcttggtag tctccttata gttgaagata 6300

aaatgttgag tgcacttatt ttagaatatc ctagacataa ctgtctaagt aaaagcgctc 6360

tattaatcta aaacactaca agagaattta acaccatctc tcaaatgctt ttttggagag 6420 

cttaatggga ttctgaatat ttgcaatgtg gagtttccgc cccgatctca cgtcagtgag 6480 

ggtctcctgt ctctcaagtg tgtttccttt ggctgttccc taatacaaaa cacggacata 6540 

tttttactcg tagcactcaa tttagtaact tctagatgct accgttgacc tgagttaaat 6600 

tcatttagtc gtgtacgtaa aaactctcct tttagtgtgt tattttcttg gccttccctt 6660 

ttaaaggtta aagtttctaa cctaagaatt aagtacgcgt tcaggaagct gttgtctagg 6720

ccttcccctt gtgaatctgg gttcattcca atacggcaag taagagttgg aaactttgag 6780

aacacagact ataaaggcag cagcccgaac actgtcagac tctaattggc gaccctggga 6840 

aacagttgcc ctgctattct ttaaagaaag acgtttattc tgatgataaa aacagttagc 6900

cagactgttt ttaaagcacc tggcgggaag cagaaggttg gatccaagcc cttgttcaga 6960 

tttggtgcct gataagacag gggtttctct ttttgtgacc tttattatta ttattttgtt 7020 

aactgttgta accagttagc tgttgtgttt taagatagaa aggaacaaga ctaaaattgt 7080 

aaatactttg taaacatcag catttgtact tgaatagtag gattttaaag ggcattgata 7140 
gcataccaaa caaaaggcaa aataaagtga cctttttata tatttttt 7188



<210> 40 

<211> 1972 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 
<223> Incyte ID No: 3464492CB1



<400> 40 

tatgtatctt 

tttaaaacac 

aaggttgagt 

gctgagaatg 

aagatggatc 

ttagtttttt 

ttaagcaagg 

aagaatattg 

gcctatcacc 

acaggagtgc 

gctcgaagag 

tataaacaga 

ctcatattgc 

aactgttaca 

ttgattggtt 

ttttttggtg 

gaatcactta 

gaagaagagg 

actatagatt 

cttgtgcttg 

cagtgtaacc 

caaaatgtag 

caagccatcg 

ctttatacct 

cgaggatata 

ttctgtgagg 

aagctgactt 

gagggtcgag 

gcaaatcctg 

attgtttcat 

gaagagttac 

catgaagcta 

aaagataact 



agaaaagcta 
tggaaagatg 
taaaagaata 
gcatgacttt 
ctgatcactt 
gtcctagtaa 
aatatctgaa 
gcaatggcaa 
acagtggctt 
tctgtctttt 
ttattttaag 
tgattggcag 
aagaaaaaga 
gccatcttgt 
tgaagattgc 
ttcagcaaaa 
gatacctgac 
tccaatataa 
tagcttattg 
aaagccttct 
ctgattggat 
ctgccattct 
gaaagaaggt 
tgctcaaaga 
tacaaaatct 
agcttgagga 
actgtgtaaa 
caaaacagtt 
aagtgctcgt 
cagcaaagat 
taagattgcc 
tctgatgaga 
tatactttat 



aactctttat 
aatttatatg 
tctgaaaata 
ttcacgtctt 
ggtagcattg 
gaagaactgt 
acataaggag 
cctgtgtcct 
aacaagtgat 
tacctgcaca 
agctccctat 
agctggtcgt 
caaacaacag 
tcaggaattc 
aacgaatctt 
ggttttattg 
agaaaaagga 
ttttcatatt 
tgacattctg 
tcatctaatc 
gatatacttc 
tggagtctct 
ggacaagaac 
gaccaacatt 
tctcactgga 
gttttgggtt 
ggcagaatta 
atacagtgca 
aaggacaatt 
gctgttgcat 
ttctgatttc 
attttcatat 
aaatgaaaaa 



agctgtcctt 
tcgtattaat 
aatgatacaa 
cttaattata 
gtgacagaag 
gaaaatgtag 
aaagaaaaat 
gttttaaagc 
gaaaggaaac 
tctaccctag 
gttgctaagg 
gctggaatag 
gtattggagt 
accaagggaa 
gatgacatct 
aaagaaaaaa 
ctcctacaaa 
acaaagttgg 
tacagagact 
tacctaacaa 
aggcagttta 
gaaagcttta 
gttgtcaaca 
tggactgtat 
actgcctcat 
tacagagccc 
atccctctca 
ggttacaaaa 
gatcatttat 
gaaaaagcag 
ctggtgctgt 
atgaattatt 
atacatttca 



tcttaaccta 

tgcaaatgtc 
tatatgaagt 
agtattctga 
ttattcccaa 
cagaaatgat 
gtgaggtgat 
gcactatccc 
tcttggagga 
cggcaggtgt 
aatttttaaa 
atactattgg 
taataactaa 
tccaaacatt 
atcatttcat 
gtctctggga 
aagacactat 
gacgtgcttc 
tgaagaaagg 
ccccctatga 
gccaactcag 
ttgggaagaa 
ggctatatct 
ctgaaaaatt 
tctcatcttg 
ttttggtaga 
tggaagttac 
gtctaatgca 
caagacgcca 
aagccctgca 
ggcttcttcc 
tattgtacat 
gttatacatt 



ttttacaata 

cttttaattc
tgacagcaaa 
taccctaaaa 
ttattcctgc 
atgcaaattt 
taagaacttg 
atttggagtt 
ggcctactcc 
caacctacca 
gaggaatcaa 
ggagagtatc
accattggaa 
atttctctct 
gaatggtaca 
aataactgtt 
ttataagtct 
atttaaggga 
tcttgaagga 
tctggtttca 
tccagcagaa 
agcatcaggc 
gtcttttgtt 
taatatgcct 
tgtgttacat 
acttaccaag 
tggagtttta 
cttagctaat 
agccaagcaa 
agaagaggta 
actgacaaag 
gtgtgttatt 
ag 



<210> 41 

<211> 1857 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature
<223> Incyte ID No: 1794336CB1



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1972 



<400> 41 

atggaagaat 

aactgtcatt 

aaccatgaaa 

agagagaaaa 

agtcaccaca 

gttaatacac 

tgtaaagaat 



ttaaaagcca 
ttgagcaaca 
acatgcccat 
tctctgaatg 
aaagaactca 
catgcctttt 
gttggaaggc 



tagtcctgag 
tcagggacaa 
ttttagccaa 
taaaaagtgt 
ttctaaagaa 
taaacaacag 
ctttgttcat 



agatcaattt 
gaggagggtt 
catactttac 
agaaaaatct 
ctttctgaat 
actattcaaa 
tgctcacact 



tcagcgctat 
attttagaca 
tcactcaaga 
tcagttacca 
gtaaagaatg 
atggtgacaa 
ttaaacatct 



ctgggaaggc 60 
acttatgatt 120
attttatgat 180 
tttatttttt 240 
cacagaaatt 300 
atgcaatgag 360
aagaattcat 420 
aatggtgaaa
cttactctac 
aaggccttta 
ccctatgaat 
ctgcgactcc 
catcgctcac 
cgggaatgtg 
tctggcaaga 
ctaaccctac 
aagactttta 
cctcatgaat 
caaagaattc 
tatcattcaa 
gggaaggcgt 
aagccagtca 
atggaaatat 
ttagtattta 
taactaaagt 
ttattgagaa 
accagtgaaa 
caaatctctg 
cacagaaagc 
caggcgtcct 
agaatgtaac 

<210> 42 
<211> 2454 
<212> DNA 
<213> Homo 



aacgctatga 
atcaaagaat 
gacagcgatc 
gtaagcaatg 
atactggaga 
atcttactat 
ggaaggcctt 
aaccttatga 
atcagaggat 
gacagtgttc 
gcatgatatg 
atactggtga 
gcttctcaca 
ttaatcatag 
ggtttcctct 
acattgtctt 
tttatttaaa 
gtaaattttg 
taaatttttt 
tatttcactg 
agataaccag 
atctttcgca 
tgtccctttt 
ccattctttg 



sapiens 



atgtaacgaa 
tcacactggt 
acagcttact 
tgggaaggct 
gaaaccttat 
acatcagaga 
tagctatcac 
atgtcatgaa 
tcatactggt 
acacctcaaa 
tggtaaggcc 
gaaaccctat 
ccatcagaga 
attacaactt 
cctccctccc 
tgttttattt 
taatccttga 
gtctcatttt 
tattgagaat 
aaacagtgaa 
tatgtgtatt 
tgtgtaggga 
cataaataat 
taagggtcca 



tgtgggaagg 
gagaaacctt 
caacatcaga 
tttattcgtg 
gaatgtaaag 
attcatactg 
tcaagcttct 
tgtgggaagg 
gagaaaccct 
agacatcaga 
tttagacttc
gaatgtaagg 
attcattctg 
aacttacatc 
catcctagtc 
aaggaaaaag 
aaatcaaatg 
aatgctgttt 
aaattttcta 
ataaaatgaa 
tcttcttcct 
ggagctttta 
cttgaatact
gaattttcct 



cctttaatta 
atgaatgtaa 
gacttcatac 
gctttcaact 
aatgtggaaa 
gtgagaaacc 
cacaccatca 
ctttttgtga 
atgagtgtaa 
gaattcatac 
attcacacct 
aatgtgggaa 
gaaagaaacc 
agactcttca 
tagcatcatg 
ttggtaagcc 
tttttaagtg 
ctgagttatt 
aaattacatg 
aacaatttcc 
ttatgctcaa 
ccccccatac 
tttccaggtc 
agaatggaat 



tggctcagaa 
agaatgtggg 
tggtgaaaaa 
tactgaacac 
gacttttagg 
ctatgaatgt 
gaaaattcat 
tggcttacaa 
ggaatgtggg 
tggtgagaaa 
tattcaacat 
ggcctttagc 
ttatcaatgc 
tactggcgag 
aaatttgttg 
accagtttca 
agaaagtgga 
gtattaacca 
tttcactgaa 
tcccttccct 
acaacctgta 
agacctttga 
agtacgttca 
gtttggc 



480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1857 



<220> 

<221> misc_feature
<223> Incyte ID No: 2903694CB1



<400> 42 

atggatcgag 

cagagacgtc 

tatgacattt 

aatgtaaact 

ctatacccag 

gagaccattc 

ttaccccctg 

tgtgtcatag 

caaagcaggc 

atgacaagag 

atcttagcga 

aacaggctgc 

aggtattcgt 

gagctgagag 

ctgaacatta 

gccgtgcctt 

gctgacccac 

tgggaagctg 

ccgcttctct 

ccctggcagc 

gctgggaggg 

aagatgtcac 

ccagaacagc 

ccacctcccc 

cccccacaac 

gctgctgctg 

gctgtggcgg 

cttattaaag 



atttagaaca 
ctagaaggag 
atgttgaaga 
tgttagaaaa 
gcaatcaggg 
ggctgcctta 
ctttgggtga 
tagaagttcg 
atattcttct
atggcgagaa 
cagctgaacc 
tgtacaacaa 
ggccctctgt 
tgtcgacttc 
ctaaagcggg 
cagaagtgga 
agctcccagt 
gctgtcaggc 
gtggtaaaat 
ccttcccaga 
cagtgagtca 
acagctccag 
ctgatcctgt 
gcacccaact 
aggcaggcag 
ctcctgctcc 
cggcggccgg 
ctagcaggcg 



ggctctggat 
atactcacct 
atgtggaaaa 
gcttgttagg 
gtattctgtg 
tgaagaaagg 
tgtcctggat 
tgactacagg 
acgtccaacc 
atggagccag 
actgtgtctt 
gcaaaagatg 
aaagccacag 
tggccaaaaa 
aagttgtgta 
tgtggagaaa 
ctggccagcc 
ctgggacacc 
acggccacgt 
tgaccattca 
ggcccaggaa 
tggcccagcc 
gtgggtccag 
tccctcaagc 
ccctcttaag 
tgctcctgct 
tggggcggca 
ccgtccagct 



cgcacagaga 
agggcgggaa 
gagcctgagg 
agagagtcct 
atgctccaga 
gcattgctgg 
aaagcttcgg 
cagtccagta 
atgcagactt 
gaagacaaat 
gatccttctg 
aataccgacc 
caggagcagt 
gaagaaagaa 
gacacgtgga 
cttgctaaag 
caggaggtag 
aagccaaaca 
aaaaaagcca 
gcttgtctca 
tcggtgcaga 
agtgtcagtc 
tcttcagtat 
tcaggaaaga 
cgtccatttc 
cctgctgctg 
ccaagccatt 
gccgggcgcc 



atatcactga 
aaactctgca 
atcctcagga 
tgccatgttt 
gagaagatgg 
actacttgga 
ttaacatttt 
atatgcaacc 
tagcccctga 
ttcctcttga 
tagcagttgc 
cgatggaaca 
ctgactgtcc 
aagtaggtca 
aaggcagacc 
ggtatcagtc 
aagacccttt 
tcatgcagtc 
ggcagaagag 
ggcctgggtc 
gcaaagtcaa 
agctctcttc 
ccgggaaggg 
tttcctcagg 
ctgctgctgc 
ctcctgcttt 
ctcagaagcc 
ccaccagatt 



aattgcccaa 
ggaaaaactt 
attgagaagc 
actggtcaat 
gtcctttgca 
tgcagaagaa 
tcatagtggg 
tcctggttac 
ggtgaagacg 
gagtcaactg 
ctgcactgca 
gtgcctccag 
acctcctcct 
gccttgtgag 
ctgtgatttg 
cgtcacagct 
cagacatgca 
gtttaatgat 
ccagaagtct 
agagactgat 
aggtccaggc 
atggaaaaca 
agagaaacat 
taacagtttt 
tcctgctgta 
agctgctgct 
ctctgtgcct 
cgtaaaaata 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 
gccccagcca ttcaggtccg gacaggctcc actggcctaa aggccaccaa cgtggagggc 1740
ccagtccggg gagcccaggt tttggggtgc agtttcaagc ctgtgcaggc ccctggctcg 1800 
ggtgcccccg ctcctgcagg aatcagtggc agtggccttc agtcctcagg aggtccacta 1860 
ccagatgcaa ggcccggtgc agtgcaggca tcttctccag caccccttca gtttttccta 1920 
aatactccgg aaggtctcag gcctctgaca ctccaggttc cgcagggctg ggcggttctg 1980 
accggcccgc agcagcagtc ccatcagctg gtttccctgc agcagctcca gcagcccaca 2040 
gctgctcacc ctcctcagcc agggccacag ggttccacac taggtttgag cacgcaaggg 2100 
caggccttcc ctgctcagca acttcttaat gtgaacctca ctggagcagg tagtggtctg 2160 
cagccccagc cccaggctgc tgtgttgagt ctgcttggct ctgcccaggt tcctcagcag 2220 
ggtgtccagc tcccctttgt cttggggcag cagccacagc cgctgctgct gctgcagcca 2280 
caaccacagc cacagcagat ccagctccag acacagcctt tgagagtcct gcagcagcca 2340 
gtgtttttgg caacaggcgc tgttcagata gtgcagccac atccaggtgt gcaagcaggg 2400 
agccagttgg taggtcagag gaagggaggc aagccaaccc ctccagctcc ctga 2454 



<210> 43 

<211> 4409 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 6975426CB1 



<400> 43 

agttctggga gtcaagcttt tactaaactt gggtgactgg tcctaagtgg cgtacatcat 60 
actggagtga gaattgactt gcatgctatt gcttttggtt tctgcaggtt ttgtagtgca 120 
tccctgagca tgagcacaga atgaagcggc gtttggatga ccaggagtca ccggtgtatg 180 
cagcccagca gcgtcggatc cctggcagca cagaggcttt tcctcaccag caccgggtgc 240 
ttgcccctgc ccctcctgtg tatgaagcag tgtctgagac catgcagtca gctacgggaa 300 
ttcagtactc tgtaacaccc agctaccagg tttcagccat gccacagagc tccggcagtc 360 
atgggcccgc tatagcagca gttcatagca gccatcatca cccaacagcg gtgcagcccc 420
acggaggcca ggtggtccag agtcatgctc atccagcccc accagttgca ccagtgcagg 480 
gacagcagca atttcagagg ctgaaggtgg aggatgcgct atcttatctt gaccaggtga 540 
agctgcagtt tggtagtcag cctcaggtct acaatgattt ccttgacatc atgaaggaat 600 
ttaaatctca gagcatcgac accccaggag tgattagtcg tgtgtcccag ctattcaaag 660 
gccaccccga tctgataatg ggattcaaca ccttcttgcc ccctggctac aaaattgagg 720 
tgcaaaccaa tgacatggtg aatgtgacaa ctcctggcca ggttcatcag attcccaccc 780 
atggcatcca gccacagcct caaccaccac cccaacatcc ttcccagcct tcagcccagt 840 
cagccccagc tcctgcccag ccagctcctc agcccccacc tgccaaagtc agcaagccct 900 
cccaactgca agcacatact ccggccagtc agcagactcc cccacttcca ccgtatgcat 960 
ccccacgttc tccgccggtc cagcctcaca caccagtgac aatctcgttg ggaacggccc 1020 
catccttgca gaacaatcaa cctgtggagt ttaatcatgc catcaactat gttaataaga 1080 
tcaagaacag atttcagggc caaccagaca tctacaaagc attcctggag attttgcaca 1140 
catatcagaa agagcagaga aatgccaagg aagctggagg aaactacact ccagcattga 1200 
cagagcagga ggtgtatgcc caggttgctc gtctctttaa aaaccaggaa gatttgttgt 1260 
cagagtttgg acaattccta ccagatgcca acagctccgt gcttttaagc aaaacaactg 1320 
ctgagaaggt tgattctgtg agaaatgatc atggaggcac tgtcaagaag ccccaactga 1380
acaacaagcc gcagaggccc agccagaatg gctgccagat ccgcagacat cctacaggaa 1440 
ccacccctcc agttaagaag aaacccaaac tgctcaatct gaaggattct tctatggcag 1500
atgccagcaa acatggtggt ggaacagaat cgttattttt tgataaggtc cgaaaggctc 1560
ttcggagtgc agaagcctac gaaaatttcc tacgctgtct tgttattttt aaccaggagg 1620 
tgatctctcg tgctgagctt gtgcaactag tctctccttt cctggggaaa ttccctgagt 1680 
tgtttaattg gtttaaaaac tttctgggct ataaggagtc tgtacatctg gaaacttatc 1740 
caaaggagcg agccacagag ggcattgcta tggagataga ttatgcttct tgtaaacgat 1800 
tgggctccag ctatcgagcc ttaccaaaga gttaccagca gcccaagtgt acaggacgga 1860 
ctcctctctg taaagaggtt ttaaatgata cctgggtttc cttcccttcg tggtctgagg 1920 
actctacctt tgtgagttcc aagaagactc aatatgaaga acatatttat cgttgtgaag 1980 
atgaacgctt tgagcttgat gtagttttag agaccaatct ggcaacaatc cgggttctgg 2040 
aagcaataca gaagaagctt tcccgcttgt ctgctgaaga acaagccaaa tttcgcttgg 2100 
acaacaccct tgggggcaca tcagaagtca tccatagaaa agcactccag aggatatatg 2160 
ctgataaagc agctgacatc attgatggtc tgagaaagaa tccctccatt gctgttccaa 2220 
ttgtccttaa aaggttgaag atgaaagagg aagaatggcg agaagctcag agaggcttta 2280 
acaaagtatg gcgagaacaa aatgagaaat actacttgaa gtctctggac caccagggga 2340
tcaactttaa
agagtatcta 
gcccacacct 
ttatccacca 
aaatcatgca 
tggaggaaga 
acaatggtgt 
aaaaattaag 
tttttatgcg 
aacggcaaat 
agcgagacaa 
atgtagaaga 
tagactcatc 
cctttaccat 
atgagatctg 
gaggccagct 
ctgagcagct 
aggtccagct 
aagcagagcg 
agcttcgtga 
ggaagtgtca 
agaccatgga 
acaagatggt 
gggctcatca 
tagataaatg 
tcatgggtga 
tgcattttgt 
aactgcaaag 
cactcacaca 
catggccacg
gaacatgaat 
ctgcctggag 
cagcaagagc 
tcacacttgg 
ataatcgaag 



acagaatgac 
tgatgagagg 
ctcacttgcg 
tgtgaagagg 
tcattttatt 
ggaagaagaa 
tgggggcagt 
aggaatggat 
actgcaccag 
tgaagaagaa 
gagtgacagc 
ttattaccca 
acagtatgaa 
ggacaaactg 
tgtgcaggtg 
gaacacacag 
aatgtcagat 
gactattgag 
ctggtcagac 
acatctagca 
acgtggtcga 
gaatgtggat 
gtatgtgatc 
gtcccatgag 
gaccaaggag 
ggggctggag 
gagcattaac 
ccagagcaga 
ctgaagaaac 
tgacctgtgt 
accttacaaa 
gggacggaag 
taaaactgga 
actgttccag 
cagtactgat 



accaaggtcc 
caagagcagg 
tatgaagaca 
cagacaggca 
ccagatttgc 
gagatggatg 
ccccctaagt 
gaagtataca 
attctctgcc 
aaccgagaga 
cctgccattc 
gctttcctgg 
gattcactga 
atccagagca 
actgaccttt 
aactcaagga 
gagaattgct 
cttctggaca 
tacgtggagc 
cagaaaccag 
gagcagcagg 
agtctggata 
aaatcagagg 
cgtgtaagca 
catgtgcccc 
ggcctggtgc 
aagtatcgtg 
taacttgggg 
aaggaagatg 
gtggctggtg 
gctgaaagct 
tccatgcaag 
ttctcctggg
cacagtggga 
tactttccc 



tgaggtctaa 
ctacggagga 
aacaaatact 
ttcagaagga 
tctttgccca 
tagatgaagc 
ccaagttact 
acctcttcta 
tgaggctgct 
gagaatggga 
agctacgtct 
acatggtgcg 
gagagatgtt 
ttgtcagaca 
acctggcaga 
gcctcctgga 
ttaagcttat 
cagaagagga 
gatacatgaa 
tatttctccc 
aaaaggaagg 
agctggagtg 
actatatgta 
agcgtctaca 
gtgaaatggc 
cctgtaccac 
tcaaatacgg 
tgtgtgtggg 
cctttcaagc 
cagcctggca 
ggaactttcc 
ccaaagacat 
cctaactttc 
gattcagatc 



gagcttactc 
gaatgctggt 
ggaagatgct 
ggacaaatat 
aagaggtgat 
cacaggggca 
gtttagtaac 
tgtcaacaac 
acggatttgt 
acgggaagtg 
caaagaacct 
gagcctgctg 
caccattcat 
gctgcagcat 
aaataataat 
gtcaacgtat 
gtttattcag 
gaattcggat 
ttcagatact 
caggaatcta 
gaaggaagga 
tagattcaag 
tcggaggacc 
tcagagattc 
agcagagacc 
cacctgtgat 
cacagtattc 
gatgtgtgtg 
ctcactgggc 
ccaagtgggc 
ccaaagggtt 
gcaggttgct 
atgaagggac 
tgcggctcca 



aatgagattg 
gtacctgttg 
gctgctctga 
aagataaaac 
ctctcagatg 
gttaagaagc 
acagcagctc 
aactggtata 
tcccaagccg 
ctgggcataa 
atggatgttg 
gatggcaaca 
gcctacattg 
atcgtgagtg 
ggggccaccg 
cagcggaaag 
agccaaggcc 
gaccctgtgg 
acctcgcctg 
cggcggatcc 
aacagcaaga 
ctgaattcct 
gcgctgctcc
caggcctggg 
agcaagtggc 
acagagaccc 
aaagcccctt 
tgggcctatg 
ctctctggga 
tacctgttag 
tgggtatagc 
tgcacacaaa 
cggatgctgt 
tagggagcgg 



2400 
2460 
2520 
2580 
2640 
2700 
2760 
2820 
2880 
2940 
3000 
3060 
3120 
3180 
3240 
3300 
3360 
3420 
3480 
3540 
3600 
3660 
3720 
3780 
3840 
3900 
3960 
4020 
4080 
4140 
4200 
4260 
4320 
4380 
4409 



<210> 44 

<211> 1290 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 4019390CB1 



<400> 44 

ccagcctctg tgtccctgtg acctgcagat attgggagat ccacagctaa gacgccggga 60 
ctccctggaa acctagaaat ggaaccactg acatttaaag atgtcgccat agaattctct 120 
ctggaggagt ggcaatgcct ggatactgca cagcgggatt tatataggaa tgtgttgtta 180 
gagaactaca gaaacctggt ctttttgggt attgctgtct ctaagccata tctgatcacc 240 
tgtctggagc aaaaaaaaga gccctggaat ataaaaagac atgagatggt agccaaaccc 300 
ccagtaatgt cttttcattt tgcccaagac ctttggccag agcagaacat aaaagattct 360 
ttccagaaag tgacactgag aagatacgga aaatgtgaat atgagaattt acagttaaga 420 
aaaggctgta aacatgtgga tgagtgtacg gggcacaaag gaggtcataa tacagttaac 480 
caatgtttga cagctacccc aagcaaaata ttccagtgta ataaatatgt gaaagtcttt 540 
gataaatttt caaattcaaa tagatataag agaagacata caggaaacaa acacttcaaa 600 
tgtaaagaat gtagcaaatc attttgcgtg ctttcacaac taactcagca tagaagaatt 660
catactagag tgaattccta caaatgtgaa gaatgtggaa aagcctttaa ctggttctca 720 
actcttacta aacataagag aattcatact ggagaaaagc cctacaaatg tgaagaatgt 780 
ggcaaagcct ttaaccagtc ctcacaactt actaggcata agataattca tactgaagag 840 
aaacccaaca aatgtgaaga atgtggcaag gcctttaaac aggcctcaca ccttactata 900 
cataaaataa ttcatactgg agaaaaacct tacaaatatg aagaatgtgg caaagtcttt 960
agccagtcct cacaccttac tacacaaaag atacttcaca ctggagagaa cctctacaag 1020 
tgtaaagaat gtggaaaagc ttttaaccta ttctcaaatc ttactaacca taagagaatt 1080
catgctgggg agaaacccta caaatgtaaa gaatgtggca gagcttttaa catatcctca 1140
aaccttaata aacaggagaa aattcatact ggagggaaac tcaacaaatg tgaagaatgt 1200
gacaagcttt taaccgatcc ttaaaactta ctgcacataa gaaaattcta atggaagaga 1260
aaccctacaa atgtgaagaa tgtggcaagc 1290

<210> 45

<211> 1516

<212> DNA

<213> Homo sapiens



<220> 

<221> misc_feature

<223> Incyte ID No: 986452CB1 



<400> 45 

ggaccccaga gagccctgag cagccccacc gccgccgccg gcctagttac cgtcacaccc 60 
cgggaggagc cgcagctgcc gcagccggcc gcagtcacca tcaccgcaac catgagcagc 120 
gaggccgaga cccagcagcc gcccgccgcc cccgccctca gcgccgccga caccaagccc 180 
ggcactacgg gcagcggcgc agggagcggt ggcccgggcg gcctcacatc ggcggcgcct 240 

gccggcgggg acaagaaggt catcgcaacg aaggttttgg gaacagtaaa atggttcaat 300
gtaaggaacg gatatggttt catcaacagg aatgacacca aggaagatgt atttgtacac 360
cagggtgcgg aggcagcaaa tgttacaggt cctggtggtg ttccagttca aggcagtaaa 420 
tatgcagcag accgtaacca ttatagacgc tatccacgtc gtaggggtcc tccacgcaat 480 
taccagcaaa attaccagaa tagtgagagt ggggaaaaga acgagggatc ggagagtgct 540 
cccgaaggcc aggcccaaca acgccggccc taccgcaggc gaaggttccc accttactac 600 
atgcggagac cctatgggcg tcgaccacag tattccaacc ctcctgtgca gggagaagtg 660 
atggagggtg ctgacaacca gggtgcagga gaacaaggta gaccagtgag gcagaatatg 720 
tatcggggat atagaccacg attccgcagg ggccctcctc gccaaagaca gcctagagag 780 
gacggcaatg aagaagataa agaaaatcaa ggagatgaga cccaaggtca gcagccacct 840 
caacgtcggt accgccgcaa cttcaattac cgacgcagac gcccagaaaa ccctaaacca 900
caagatggca aagagacaaa agcagccgat ccaccagctg agaattcgtc cgctcccgag 960
gctgagcagg gcggggctga gtaaatgccg gcttaccatc tctaccatca tccggtttag 1020 
tcatccaaca agaagaaata tgaaattcca gcaataagaa atgaacaaaa gattggagct 1080 
gaagacctaa agtgcttgct ttttgcccgt tgaccagata aatagaacta tctgcattat 1140 
ctatgcagca tggggttttt attattttta cctaaagacg tctctttttg gtaataacaa 1200 
acgtgttttt taaaaaagcc tggtttttct caatacgcct ttaaaggttt ttaaattgtt 1260 
tcatatctgg tcaagttgag atttttaaga acttcatttt taatttgtaa taaaagttta 1320 
caacttgatt ttttcaaaaa agtcaacaaa ctgcaagcac ctgttaataa aggtcttaaa 1380
taatctcttg ttaggaaaaa tgtccattaa taaggccagt cttcagcaaa actaaaacca 1440 
ttttgttcgt ttagctttcc tagtctgaca acgcaatact gttgaaccac agtcaaatat 1500 
aatgacaaca ttggat 1516 

<210> 46 
<211> 5123 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 2807579CB1 

<400> 46 

cggccccgga tgttggctgc tcttgctgcc gccgccgctc tcggggctgc tcggggcctc 60 
gaccgctcgc agtaggcacc gttgggacct ttcccggggg ggcggagacg cacggagcgg 120 
ccgggggctc ggccgggccg tcccggcctg cgcagcctga gattctcttt gtggccagat 180 
gggtttgtat ggacaggctt gtccatctgt aacttcatta aggatgacat ctgaactgga 240
gagcagccta acgtctatgg actggttacc acagctcacc atgagagcag ccatccaaaa 300
atctgatgct acacaaaatg cacatggaac aggaatttct aagaagaatg cactccttga 360 
cccaaataca actctggacc aggaagaagt ccaacagcac aaagatggga aacctccata 420 
cagttatgcc agcctcatta catttgcaat taatagctca cccaaaaaga aaatgacttt 480 
aagtgaaatt tatcagtgga tttgtgataa cttcccatat tatagagagg ctggcagtgg 540 
ttggaagaat tccatacgac ataatctgtc attgaacaaa tgtttcctta aagtgcctcg 600 
atctaaggat gaccctggaa aggggtccta
tgcgctgcct actcggccaa agaagagggc 
cactgatcag gatggtagtg atagcccacg 
gagtttggca tctgttaatt tgaacagtgt 
aagccatcca gaatcagtct ctcaatcatt 
tccagaaaga gacaagcaac tacttttctc 
atttcggagc ctttataagt cagtttttga 
catcccttct gaatcttccc agcagtccca 
cagtacagtg agcactcacc cacacagcaa 
tggcctcaac accacaggca gtaattcggt 
gcacacacag ccatctccac atcctcccca 
gcgttcccca cacccagcac cacacccaca 
ccagcatccc tctccacatc aacacataca 
aacacatcag gcacccccac ccccacaaca 
ttggtatgcg acacttgata tgctaaaaga 
gtcagatgta gacctttcac agtttcaagg 
caagaactgg tctttagacc aggttcagtt 
ctttacacaa actggcctta tacattcaca 
tgccatgcat ccaacaaaac cttcccaaca 
taggcaaaat ctccctcctt cagtgatgcc 
actcagcact ccaggaacaa cgatggcagg 
gatgccttcc caagccttcc agatgcggcg 
ctttgattgg gattcaattg tgtagggctt 
cctttctgtg cagtgaaggg aaaggtttaa 
atcactttac caatgttatc aaaattactt 
taacttactg cttttatctg acccaagcaa 
cctatgtagc tcctaactgt tgtgtgattt 
gatgttaacc acaagtgcca gactgatttt 
ttatataaag atacatatgt gtaaatatat 
ttctaattgg attattatgt cttgaaagtt 
tttctgcagg atgcttttaa ctgatgtagg 
cccagttccc cttatttaat cttttttaga 
caaggaaacc agaatttgac acactccaac 
catgaagaac tgaccaaatt ggttttgatg 
ttttctaaat ctgcattata atagctctaa 
gcttggctct ttaaacacat cagtgcttcc 
tgtcatttta atatttattg ctaccttctg 
ggagaaatgt gtctgaaagc acagaactat 
cttgccctgt tgcccaggct ggagtgcagt 
ctcccgggtt caagcaattc tcctgcctca 
caccgccatg cctggctgat tttttgtatt 
tgggctggtt gcaaactctt gagctcaggc 
gggattacag acatgagcca ctgcgcctgg 
cagataagac actcagagag acattgtgtc 
aatagatttc tctgtcttgg cattcttgct 
tctggaattt gaatcagatt tcatttgggc 
gtttccccac tcatctaggt tgtcttgaaa 
cttcagtgtt taatgtagat attatgtggc 
tatgcatgac atttggtccc tttctttgaa 
tggtggtttg ttttgtttca tatcatatca 
atgataaaac tagaagaaat gagggttttt 
tagacccccc aaaaagtgtc agttctaatg 
ataactgctc actagttgct ttttagcatc 
taattagcct tagtgattct atggttgagt 
tgtttctgtg tgtgtgtgct gtgtgtgaga 
taatggagtg ttgccttgaa tgaatcactg 
ttggggagaa aggaagagct ttatgtttct 
aggaagctca gttccagccc cttggatcaa 
aacttgcgcc tcttagaagg atcagaggca 
ttgaaaaaag aggggtgatt ttcaatggtt 
atagctgttt tgaaactgtt ttaattgctt 
atattggaat gcagtctcat actgagtgat 
aattgtgtga agcgaatttc ccatttggca 



ctgggcaata gacaccaatc cgaaggaaga 660 
acgatctgta gaacgggtaa cattgtataa 720 
cagtagcctt aacaacagtc tctcagacca 780 
tggaagtgtg catagttata caccggtgac 840 
aactcctcag cagcaaccac agtacaacct 900 
agaatataat tttgaagatc ttagtgcctc 960 
gcagtcactt agtcaacaag gtttgatgaa 1020 
cacttcatgt acctatcagc actctcccag 1080 
ccaaagcagc ctgtccaaca gtcatggcag 1140 
tgcacaggtc tcactgtctc acccccagat 1200 
tcgaccgcat ggtttaccgc agcatccgca 1260 
gcaacacagc cagctccagt cccctcaccc 1320 
gcaccatccg aaccatcagc atcagacgtt 1380 
ggtatcctgt aattctggtg tttcaaatga 1440 
aagctgtcga attgccagca gtgttaattg 1500 
tctgatggag agtatgagac aggcagatct 1560
tgccgatctt tgttcttctc ttaatcagtt 1620 
gagtaatgtt caacaaaatg tttgtcatgg 1680 
cattggaaca ggaaatttgt acatagattc 1740 
accccctggt tatcctcata tcccacaggc 1800 
ccatcacaga gccatgaacc agcagcacat 1860 
ttccctgcct ccagatgaca tccaggatga 1920 
gtttctgcaa gacaccagac cctaacgtta 1980
gagaatccag ttgagaaaac aaacttgcta 2040
ttgaagacaa tcagaaggat tttagctgga 2100 
gtactacatg tttgtctccc tgccagctgc 2160 
ggacggcttt ttgcatattt gtgtcagttt 2220 
tcagacggag cctattttgc tgcaagcagt 2280 
gtacaaaaat tactgaaagg cttcagtttt 2340 
attgtcagtt tttattcctt gttaggctat 2400 
aaactgaaag gaaatagatt ttttccaaaa 2460 
aatgtgggta atgaattcta tctaataagt 2520
aatccaaagg ggcatgttgc tcctgagcag 2580 
cttgggggat catagagtat ttatgtctgc 2640
aatttgttga ttggtaagaa attgggcatt 2700 
acattcacct atgtatttat tattcaaaag 2760 
tgaatgctta gctcctgtcg ggttcattaa 2820 
tattattttt ttcttttttt gagatggagt 2880 
ggcacgatct cggctcactg caacctccgc 2940 
gcctcccgag tagctgggag tgcaggtgca 3000 
ttagtagaga tggggtttca ccatgttgcc 3060
agtctgcctg cctcagcctc ctagagtgct 3120 
cccagcgtta cttttcttga taagaattta 3180 
attctctatc atcactcctt ttaccatgtg 3240 
gattcagact tattagttaa atcggatttt 3300 
aacaacagca gggcagggtt tctaaagcag 3360
ggaggataga gccacttaca gttttatttg 3420 
ctgctgtttc tgttgtcttg tatgtctgtg 3480 
atgcagcccc tcttacgggt tttttagtgg 3540 
cagtttgcat ggactgatta tcttagtttt 3600
ttttaatgaa tagattttga attgattctt 3660
gggaaatata ttccatcaag tcaaagagta 3720 
tctggtctta tcagccatgc taaatcactt 3780 
aatctctact tgaactaaac aaacatcttt 3840
gtgtgcgcgc gcgtgtgtgt gtgtgtgttt 3900 
ggaagccagc catggtaagg gctggtgagg 3960
ctgttgtttg gaccctactt ggcatgaaaa 4020 
cgaaaatcag aggattctgg aaaggcagcc 4080 
agatgagatg gcagcctgca gagtaaatgc 4140 
tgcttaagtc actgttttct agacaccaaa 4200 
gggtagcaat gtgcacttta aacaatttgg 4260 
ttgagtagaa ccactgatga tgattttata 4320 
atcatttact gatttgcagt gatgaatatt 4380 






tttatgagaa 
tcgcactctc
tggccagaca 
tagttgctaa 
aaagaatagt 
ggcatttgtg 
cgctcagatg 
agtattttga 
atgaaaatgc 
tcagtgtttt 
tgaacggcaa 
aggattgttg 
aactctagaa 



tttaaactta 
ctgtgatcca 
gggttacctg 
tgcagaacaa 
aaatgcatgt 
gcagtttttg 
tcctttgaag 
tattgttcat 
attggctttt 
aatgtttcta 
ccattatttg 
ctgccagaac 
aaaagaaaaa 



gcaagaatgg 
ggtggtccag 
gtgaggtgca 
gttctgtctt 
aaacaaatgg 
aaatgtaaat 
tgggagggaa 
tcagagggtg 
tgtgcagata 
cagttttgct 
taactgttta 
tgatatgcat 
aaa 



ccatggaggc 
gagcccagga 
gagagtccct 
gggcttaaat 
ggacactctg 
gtattcatgt 
tcaatccggg 
atgtgtacat
caacctgctc 
attgcacgat 
gtgctgtaaa 
gaatggcact 



aaagccttca 
caggcctttt 
ctagtggcca 
tgactgaaga 
ttcaggagaa 
gtgttcttgt 
gataatttca 
atctatattg 
tctgtactgc 
ttcatatttt 
gaaatattcc 
taaaataaat 



cccagaccca 
ctgtgggccc
ttttgtatgg 
ctttaggggg 
taatccgact 
aaatacgtgt 
aatggaatag 
tatatatgtg 
tgttggacag 
gcctctatga 
aagtgtcatt 
atattatgtt 



4440 
4500 
4560 
4620 
4680 
4740 
4800 
4860 
4920 
4980 
5040 
5100 
5123 



<210> 47 

<211> 707 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature 
<223> Incyte ID No: 5724273CB1



<400> 47 

ccaggctctt 

ttccggtgtc 

aaggagtttg 

cttgacagtg 

ctggcttttc 

cttgcaattc 

tctctcagga 

tgttggagaa 

aaaaaaatat 

agtaaaaccc 

ttcgagggac 

aaaatacctt 



gtttcccatc 
cttaaggttc 
aagaggagaa 
aatgtgtctc
tggattctgt 
cagaaccatg 
ggagtgggaa 
ctatagtaac 
tttcagaaaa 
ttggccttga 
taaaaggaca 
cttacagaaa 



agcctctggg 
ttggactctg 
aaggattttt 
attacattga 
gcttcaccaa 
actgatgggt 
tgcctggacc 
ttggtgtcac 
tgatattttt 
ggcatccatc 
tcaagaggga 
aaagtaatct 



tccaaactaa 
cctgtgctgc 
gcatgtttag 
ataacagcta 
tagaggaatc 
tggtgacatt 
ctgctcagag 
tggatttgga 
gaaataaatt 
ttcagaaata 
tacttcagtc 
cttactccac 



ccacaggtct 
caggttgcca 
aaatcaaggt 
caggatgttc 
ccatggagca 
cagggatgtg 
ggacttgtac 
gtcaaaaaca 
tttcccagtg 
attggaagtg 
aaatgataat 
atcaaag 



ctgggtcctt 60 
gatttctaag 120 
tcaggagaat 180 
gagagctgtt 240 
ttaattagtt 300 
gccatcgact 360 
gtggatgtaa 420
tatgagacca 480 
gaaggacaaa 540 
caaaagcata 600 
cagctatgaa 660 
707 



<210> 48 

<211> 2170 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 3614884CB1 



<400> 48 

tgtttcaggt tcatgcatgt tgtagcatga 
taagatttca ttgtatggat aggcgacaat 
gttgtgtgca attttgggtt attatgaata 
acaacacatg ttcatactgt tgggtggcat 
tttttgcctt ccatggcatg tttcaggcac 
agaactctgc aaattcccca aagaaggagg 
cttcaaggat gtggctgtgg tcttcaccga 
gaggaagctg tatcgagatg tcatgctgga 
tcaactttcc caccgagata cttttcactt 
gacagcaacc caaagagaag ggaattcagg 
tccagaaaca ggaccacatg aagagtggtc 
tgagttaact agacctcaag actccataag 
cccctcccag gttgacgcag gactatctat 
tgggaagtgt aaaaaattct tcagtgatgt 
ctcaggaaag atatcccata catgtaatga 
tctttgtctt catcagaaag ttcacatggg 



gtcaatattt catttcttta aatttccaag 60 
ttgtccattt acccatagga aggcattttg 120
atgcccctgt gaacattcaa gtgctaattc 180 
cactgaccac cccttcatgt ctcttttttt 240 
aattctgtct tctcttgaac tgcataactg 300
aaaaatgacc atgttcaagg aggcagtgac 360
ggaggagctg gggctgctgg acgtttccca 420 
gaacttcaga aacctgctat cagtggggca 480 
ccaaagagaa gaaaagtttt ggatcatgga 540 
aggcaagatc caaactgagt tggagtctgt 600 
ttgccagcaa atctgggaac aaactgcaag 660 
tagctctcag ttctccacac aaggtgatgt 720 
aattcacata ggagagacac cttctgagca 780 
ctccatcctt gatcttcatc aacaattaca 840 
gtacaggaag agattctgtt atagctcagc 900
agagaaacgc tataagtgtg atgtgtgtag 960 
taaggcattt
accattcaaa 
ttgcaaatta 
tcacaattcc 
ttatatatgt 
catgcaagag 
acttaatagt 
aaggagcttc 
accatataat 
tcagcgggtc 
tatgaattca 
tcagaagtgt 
cacgggagag 
tattttgaga 
gaagaggttt 
gccttacaaa 
tcaaagactc 
gcacagttca 
tgaggactgt 
tttaaatgat 
acaatgtatg 



agtcagaact 
tgtgagcagt 
cacacaggag 
cagcttcggg 
ggtaagagct 
aaatcattta 
cattgcatgg 
acttgtaggc 
tgtaatgtat 
cacaatggag 
cagggccatt 
gggaagggct 
agaccttata 
cataagagac 
actgagaatt 
tgtgaggagt 
cacagcagag 
tgccttcaag
gggaagcgct 
atataattat 



cacaactgca 
gtgggaaaag 
aaaaacctca 
aacatcaaag 
tccatagtag 
gatgtgatac 
accacacaaa 
aagatctttg 
gtgggaaggg 
aaacaacatt 
cacatcagag 
acattagtaa 
attgtaagga 
tccatactgg 
caaaacttcg 
gtggaaaggg 
aaaaactatt 
accaacaaag 
acgagaggcg 
tgtccatatt 



aactcatcag 
tttcagccgt 
tatttgtgag 
aatccatact 
atcaaatctt 
ctgtagtaat 
agagaaacta 
taagcatcag 
cttcaggtgg 
caagtgcgac 
agcctataga 
gtttaatctt 
atgtggaaag 
agaaaaacca 
tttccatcaa 
attcagatgg 
ccaatgtgag 
cgaccacagt 
cttgaatcta 
tatggttaca 



agaatccaca 
agatcaggaa 
gaatgtggga 
ggggagaagc 
aataggcatt 
agctttggtc 
tacaaatgtg 
atggaccata 
tcctcatgtc 
ggatgtggga 
gaagaagaac 
gacttgcacc 
agcttcaggt 
ttcaaatgtg 
agaattcaca 
gcctcaactc 
gattgtggga 
ggagaaaaaa 
gatatgattt 
gcatatttaa 



ctggagagaa 
tgtatgttca 
aggccttcat 
cattcaaatg 
ccatggtcca 
agagatcagc 
aagaatgtgg 
caggagacaa 
tttcaagaca 
agagatttta 
tgtataaatg 
agagggtcca 
gggcctcagg 
aagagtgtgg 
ctggagaaaa 
atctaaccca 
agagcagtga 
catccaaatg 
tatcattatt 
atacatgtat 



1020 
1080 
1140 
1200 
1260 
1320 
1380 
1440 
1500 
1560 
1620 
1680 
1740 
1800 
1860 
1920 
1980 
2040 
2100 
2160 
2170 



<210> 49 

<211> 2778 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature
<223> Incyte ID No: 3794954CB1



<400> 49 

ggagagtcgc 

tccagcctgg 

gcaagaaaga 

aagaaggagg 

tgagtagaca 

atggccccac 

cctttactcc 

gggcagctct 

actgaatcac 

ggcctggtaa 

gacccctctc 

gtctctctga 

agccatctgc 

tctgtctatg 

atatcattaa 

aggttcttct 

ccttgtaaac 

gtaggactaa 

agaaaacagg 

aaatgggtgg 

agagttccaa 

ttgaactaat 

ggacatgctt 

agaggagagg 

tacccctgaa 

tctctggaac 

gctcctgggg 

gatggattcc 

gtcccacctt 

gtgtgagaag 

tgacaagacc 



ttgaacctgg 
gcgacagtga 
atgttttatc 
aaatggtaca 
cctgtagcta 
ttgggaaatt 
tctcctcatt 
tttctcagat 
attttcctta 
ccatcaagga 
agacagactt 
gtaagcatga
ttcagggcta 
cttccgctgg 
aaaaaaaaaa 
tggttgggaa 
ttgggaacca 
agtgaagtta 
ttcattggga 
gggaagaggt 
acaataggaa 
tgacagctct 
tctcaactag 
gacattctga 
ctagaagcag 
ccggagcatg 
cccccttttc
ctgttaagac 
gccaggcatc 
acctttgggc 
agcaggtgct 



gaggtggggg 
gactccgtct 
tgtagcattt 
taatttcttc 
aacagaggtg 
cctccagtaa 
tccctgttac 
catttctcac 
aaaaacagct 
tgtgtcactg 
ttatggagaa 
ccttctccag 
ttggatagtg 
agaaagaaca 
gtaaaaaagc 
ttaggaggtt 
tgtagaggaa 
cgcccagaaa 
aggagagtga 
tggggtgggg 
acagtttttt 
attttccttt 
aaggaggaga 
gggtcacata 
aacctcccag 
atgagagctg 
ttcaggaaga 
cccacacatg 
aacaaacaca 
gaagacatca 
ctgagtgtgg 



ttgcagtggg 
caaaaaaaaa 
tgtgctgata 
ataaaaagag 
gtgatgagtt 
cattacactc 
caaaactatg 
cctctctcct 
gctgacagaa 
tgcttctctc 
tatgtcatgc 
ccactagaag 
tctgtctagg 
ctagggccaa 
aaatgatgct 
ctcataaagc 
gatagggatg 
ggccagctgg 
ttttctaggc 
atatgtttct 
gtattttttt 
cccatgagca 
agaacaatgg 
tacaggagat 
aatgttatcc 
ggattccatg 
tagtttctca 
cccccagtgt 
cactggggag 
cctcatcagg 
taagaatttc 



ccaggattgt 
aaaatgctgt 
gaaaaggcag 
taaggccagg 
tatggtgcct 
actcactcct 
tgccctaccc 
ctactaactt 
gatgctacat 
aggaggagtg 
aggaaaactg 
attggactct 
actctttatg 
tcccaagaga 
taattgtgga 
agagagaact 
cagggagtag 
gctggaagta 
agagaagagg 
ttcttgacct 
tcagaaatat 
ggatttccaa 
gtccctgacc 
ggaagtgaac 
agcgtgtctg 
cccagcagct 
aacctgctgt 
gggaaacagt 
agaccctaca 
caccagaaaa 
cgatgcaact 



gctactgcac 
ggcaagggag 
ggtccagtgg 
gagaagtgac 
ccagcttctc 
tgaccatcta 
agggacttct 
agcctttcat 
tttgtttcag 
gcggagcctg 
tgggatagta 
gggaggacgg
tggcagggaa 
atgagagtaa 
gtagaacttg 
ggaactgggt 
gactgtttta 
tctgagtgga 
gacttaacag 
tcttttgtct 
agaattgctg 
ttcccaaact 
cccaggactt 
atgaggggga 
aagatactgt 
ccagaggaat 
gtagcacaga 
ttgtatgggg 
gctgcctcaa 
cccacctaca 
cccatctggc 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 
cagccaccag agagtgcatg cagaaggcaa atcctgcaaa ggccaagagg tggagagagc 1920

cctggcacaa ggaaacggcc gcgtgcccca ccagtgccaa agtgtcacgt gtgcactgaa 1980

tgtggggaag agctttggcc gaaggcacca ccttgtgaga cactggctga cccacactgg 2040

ggagaagccc ttccagtgcc ctcgctgtga gaagagcttt ggccgaaaac atcacctgga 2100 

caggcacctg ctcacccacc agggacaaag tccccggaac agctgggaca gaggaacatc 2160 

tgtcttttga aatctgtttc cactacagct atggtcaagt ctatcagccg gtgctaccag 2220 

gagtcactgc cagggctgcc gttctcctga accccagtgg ccagaatcat aagccctgac 2280

cccatcccta gaaagatgag gtcccagcaa tggccagagc atttctcacc agttctgtga 2340

gatagcacat aaaaatagag ttctttgggc aaaacttttg ggaagcaatg catcctacat 2400 

gggctgatat tcagcctgag ctgttctcta gaggagagtg gtactggcag tttatggctg 2460 

aaatccattc tgattggttg gagtctatgc tataccagtt gttaaacatt ttgagtatca 2520

ctcttgcata ctgttactat tatattttct ctatatatag acagaaaggc cattttagaa 2580 

tattaaaggc tctgaaaatt tctgcagtag acccaactga aggttctatt aaggcagtgt 2640 

ttcctaaatg tatttgacct cagtagcctt tctatttaca tcccacagaa ctggtgacta 2700

tgaaacactc tggtgaccct gggttagaga gacccaggca acagagatag tgtccaggct 2760 

cagagcccct tttaatcg 2778 



<210> 50 

<211> 2478 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7399016CB1 



<400> 50 

cgccggcgtc 

aggagcgctg 

cctgcttcgt 

accgcggcgt 

ggtgcagttc 

aaatttgaaa 

agagccgcgc 

ctgccctcgg 

aacacttcct 

tctgccacgg 

cgagcatgga 

ttggtgtggc 

cccagttcta 

cggctggtcg 

agggagcgtg 

ggtgggtgca 

cctccgagta 

ccatggatac 

ggccatggga 

atgcccctca 

ccctgcccag 

cgggcttcgc

agcagcttcc 

agggagacgt 

agtcctttga 

ccaagaagtc 

agaagcttcg 

gcacggccgt 

aggtccggga 

acctgcagcg 

gtggacaaac 

ccaagccttt 

accacatgac 

ggcggtttga 

agacccagga 

ctgtgaccac 



cccggcccgg 
tcaccgcggg 

gccctcgcgg 
ttcctggagg 
caggtttcac 
gatttcttgg 
gcccgagccg 
acctggagtc 
gccgctctgt 
gaagttttcc 
gaggccatcc 
tgtccgccag 
ccagtgccac 
ccggaagcct 
tctggtggat 
tggacatgcg 
ctgcggcgtc 
cagctccagc 
caaagagacg 
gacctcccag 
cacggatgtg 
acctcagcca 
atcttcaacc 
cttgagtgaa 
gccttaccca 
tgaagaacca 
ttgtgagagg 
gtaccgaggc 
gcggccctgc 
ccacgtgaag 
cttcaagcag 
gcagtgtgag 
caaacacaag 
gaaggcccac 
caaggccctg 
agagggccag 



cctccccggc 
caggtcgccc 
agccggaaag 
gcgcagcgcc 
tcttgaagtt 
accagcggcc 
cttgctgtgt 
cgggacgccc 
gcaggagcag 
tcgagaagcc 
gcagaggagc 
gaccccacct 
agccttctca 
tgtgcaaagg 
ctgatcacat 
gccagctgcg 
atccaggtcg 
tgcaaggcct 
gcgccacggc 
ggtagaggga 
gcccagcctc 
agcctgcccc 
tcggatgatc 
gatgaaaatg 
gaaaggaaag 
agaattcgga 
gaggagcttc 
gctgacggca 
ccccaccctg 
ctcatccaca 
cggaagcacc 
gtctgtgggt 
gctgagactg 
aacctcaatg 
cccctggagg 
gcggtgaagc 



cccgggctcg 
tggggtgccc 

gaggcggcac 
tgccccgggt 
gcttcgctgg 
acatccccca 
ccgggagccg 
tgtgctcagg 
gccgggctct 
tgcgcagcat 
gcgtgctcgt 
tgtctccgtt 
agtccttcct 
tcggtgccca 
ccagccccca 
gggccctgcc
tgtggggctg 
tcttgctgga 
tgccccagca 
cagggacccc 
cttcggacag 
tttgcagggc 
gggtaaaaga 
acaagaagca 
tctctggtaa 
agaagccggg 
ccaccatcta 
tgaagaagca 
gctgcaacaa 
cagaggtgcg 
ttctcgtcca 
tccagtgcag 
agctggactt 
tacacatgtc 
cggaaccacc 
ccgaacccac 



gccagaaggg 
gcgcgtggga 

ctgcaattgg 
ctcgcggtcg 
agaggcctta 
cagggcggct 
cgcgaagggc 
agcctccttt 
cgccatgggt 
ctccgagagg 
acgggacttc 
tgtctgcaag 
gcagagggtc 
gcccccaaca 
gtgcctgcac 
ccaccttcag 
cgaccagggc 
cagtgcgctg 
ccgagggtgg 
agttggggct 
cgacgcggtg 
cccagggcag 
cgagttcagt 
aaatgcccag 
gaagagtgaa 
acccaagccc 
caagtgtcct 
catcaaggag 
ggttttcatg 
gaactatatc 
ccaaatgcga 
gcagcgggca 
tgcctgtgac 
catggtgcac 
acctgggcca 
ctgaggacgg 



acccggcgtg 
atcgccctcc 
cagacaggaa 
ctgggaggag 
gaggcctttg 
tgtcccgccg 
gctgcaggcg 
gccagctgct 
cactgtcgcc 
gcgcctggag 
cagcgcctgc 
agctgccacg 
aacgcctccc 
ggggcagagg 
ggcttggtgg 
aggacactgt 
cacgactaca 
gcagtcaagt 
aaccctgggg 
gagaccaaga 
gggcccaggt 
ttgggtgaga 
gacctttctg 
tcttcggacg 
agcaaagaag 
ggatggaaga 
taccagggct 
caccacgagg 
atcgaccgct 
tgtgacgaat 
cattcgggag 
tccctcaagt 
cagtgtggcc 
ccgctgacac 
ccgagcccct 
cagtgaggat 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1980 

2040 

2100 

2160 
gagcacctct
ccgcatggga 
ggctcaagta 
gactcggggc 
gatttgtaat 
tacacgggaa 



agcagcctgg 
gggtcggagg 
gccttcctct 
cggacagttc 
ccacttttta 
aaaaaaaa 



actccgcagt 
gtgctgcccg 
gctctgggac 
ataaataatt 
gtgcaacaag 



ggctgtgtca 
cccttggtgc 
cagtggttta 
gattcctttc 
agctccatgt 



gcctcaccct 
tggaggcggg 
ttttcccgca 
cccactaaag 
tatgcttgta 



tcgtgtgcac 2220 
cttggtgtcc 2280
aacgctgagt 2340 
cagtcgagga 2400 
ataaattatt 2460 
2478 



<210> 51 

<211> 1947 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 6996690CB1 



<400> 51 

gtggtggagt 

aggagaagga 

ccaaagtgtc 

caaatggcag 

ttcagtgaac 

gactgtgatg 

acactttctg 

agaacgtgga 

cagtcacatc 

caatgtggga 

gtagaaaaac 

aatagtcaca 

tgcttcactg 

tatcaatgta 

cgaatacaca 

ttttatctac 

gtatgtggaa 

ggaataaaac 

cataatcatg 

gcctttgcta 

tatatatgta 

agaattcaca 

ttttatttac 

gtggaaatct 

gaaattttat 

acatgaaaga 

tggaatcttt 

ttattttatt 

ggcatgatcc 

aggaggtgga 

tttcttgtcc 

tcagtattga 

aaaaaaaaaa 



ccaccctggc 
aaagttgagc 
agcacctcaa 
aaatccacaa 
actcatgcct 
agtatggaga 
tgttgaatca 
taggagacaa 
ttcaggcaaa 
gagcttttac 
cctacgaatg 
tgcgaaccca 
tttcttcaca 
aggaatgtgg 
ctggagagaa 
taactgaaca 
aatccttcag 
cctataaatg 
taaaaatcca 
catcttcaca 
aggaatgtgg 
ctggagagaa 
ttactaaaca 
tttaggaatt 
tggtacaaag 
actcatactc 
ttttattatt 
ttatttgaga 
cagctcactg 
ggttgcagtg 
ctttttctca 
tgtttaataa 
aaaagaaaga 



tgtggaccta 
acattgccaa 
caacattttt 

tggaggggaa 
taatgcacac 
aaactttccc 
gtgcagaaaa 
atcctttgaa 
taggataact 
ttactccaca 
taaggaatgt 
tactggggag 
cctagttgaa 
aagagccttc 
gccctatgaa 
ttttaaaact 
aagctcttca 
taaggagtgt 
tactggagag 
actcattgaa 
gaaaaccttc 
accctatata 
tttaaaaaca 
tctcatacca 
aatgtgagga 
tatgtgtccc 
ttattttatt 
cagagttttg 
caacctttgt 
agccaagatc 
gtttgattat 
agaatgtaca 
aaaagaa 



ttcaaactca 
ccggagttcc 
ttttggctta 

ctctgtgact 
atgggaactg 
atgttacaca 
gccttcagcc 
tacagtgact 
cacaatggag 
agccatgctg 
ggaaaattct 
aaaccctatg 
catgtaagaa 
gctgggcgct 
tgtaacgaat 
cacacagagg 
tgccttaaga 
gggaaagcat 
aagccctatg 
catataagaa 
cgtgcttctt 
tgtaacgaat 
cactgaagaa 
taattaccac 
ggccttcagt 
ctataaattt 
ttattttatt 
ctgttgtcat 
ctcctgggtt 
gcaccaccgc 
atttatatac 
atggcaaaaa 



gtctgtctct 
ccaagaatgg 
acaaaacatc 
ttatggaaaa 
aaaatacagg 
acagtgcccc 
tgccaccaaa 
gtgaggaagc 
aaacactcta 
tgtctgttaa 
ttagatattc 
aatgtaagga 
ttcatactgg 
caggccttac 
gtgggaaagc 
agaagccctt 
atcactttag 
tcactgtttc 
aatgtaagga 
ctcacactgg 
cacatctaca 
gtgggaaagc 
aaactctgaa 
atttgaattc 
tatccccatt 
taggaatgtt 
ttattttatt 
cacacaggct 
caggagaatt 
atgagagaga 
atatgtcagt 
aaaaaaaaaa 



tgactggagg 
caagttaaaa 
aaacggcata 
tggagaaatc 
ggacacttat 
tgctggagag 
tgttcaccag 
ctttgttgat 
tgaacagaag 
aatgcatact 
ttcatatctt 
atgtgggaaa 
agagaaacct 
taaacatgta 
ctacaatagg 
tgaatgtaag 
aattcacact 
ctcaagctta 
ctgtgggaaa 
agagaaaccg 
gaaacatgtt 
ctacaataga 
tgtaaggaat 
acactgaaga 
cacttcgaag 

agggagccaa 

ttattttatt 
ggagtgcaat 
gcttgaaccc 
gagattacta 
aaatctgttt 
aaaaaaaaaa 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1800 

1860 

1920 

1947 



<210> 52 

<211> 3553 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7740866CB1 



<400> 52 

ggcgcggggt ccaagatggt ggcgctagga gccgcgaccc agtgatagcg gccgtggagg 60 
ggcccccgac cgagcgggag gttgggggta gcctggaggt gagaccccgc tgcgttcaca 120 
gagctggcgg ccgcggcgcc tgcgtccacc ggcaagcgag gagtggagca gagctcatat 180
gcctatgggg agaggcctgg gggaccgcag gaggatgtag gcgccgggtg cgcatgactg 240 
ggccttcttg ctctcggtcg gcttcttggt ctcggcgtgc cccattcccg gccctgtccc 300
gctactcttg cctgttccag ttccctcctg tgggtaccgc taacggcatc ttcccgaggc 360 
ccagaatcca ccgtcaggca taaattgatt attcctcgcc cacatctgtg ccagattctg 420 
aagacaggaa taagatgaag gaatggaaat caaagatgga aatttctgaa gaaaagaagt 480 
cagcaagggc tgcatccgaa aaactccaaa gacaaatcac ccaggaatgt gagttagttg 540 
aaaccagtaa ttctgaggac agattattga agcactgggt aagcccttta aaggatgcaa 600 
tgagacatct cccttcccaa gagagcggta tcagggaaat gcatattatc ccccagaaag 660
ccattgtggg agagattggc catggatgta atgaaggaga aaaaatactt tctgcaggag 720 
aaagctccca tagatatgag gttagtggcc aaaacttcaa acagaagtca ggattaactg 780 
aacatcagaa aattcataat ataaataaga cctatgaatg taaggaatgt gaaaaaacct 840 
tcaacaggag ttcaaacctg atcatacatc agagaattca tacaggaaat aagccatatg 900 
tgtgtaatga atgtgggaaa gactctaatc aaagttcaaa tcttattata catcagagaa 960 
ttcatacagg aaagaaacct tatatatgtc atgaatgtgg aaaagacttc aatcagagct 1020 
ccaatctggt gagacataag caaattcaca gtggtgggaa tccctatgag tgcaaagagt 1080 
gtgggaaggc ttttaaggga agctcaaacc ttgtcctgca ccagagaatc cacagtaggg 1140 
ggaagccata tttatgcaat aaatgtggga aggctttcag tcaaagcaca gatcttatta 1200
tacatcacag aattcacact ggagagaaac cctatgaatg ttatgactgt ggacagatgt 1260
tcagtcaaag ttcacacctt gtcccacatc agagaattca cactggagag aaacccctca 1320 
aatgtaatga atgtgaaaaa gccttcaggc agcattctca ccttactgaa csLCcetgagac 1380 
tccacagtgg agagaaaccc tatgaatgtc acagatgtgg gaagaccttc agtgggcgca 1440 
cagcttttct taaacatcag agattgcatg ctggagagaa aattgaagaa tgtgagaaaa 1500
ccttcagcaa ggatgaggag cttagggaag agcagagaat tcaccaggaa gagaaagctt 1560 
attggtgtaa tcagtgtggt aggaatttcc agggcacctc agacctcatc agacatcagg 1620 
taactcatac aggagagaaa ccatatgaat gtaaagaatg tgggaaaact ttcaatcaga 1680 
gctcagacct tctgagacat catagaattc acagtggaga aaaaccttgt gtatgtagca 1740 
aatgtgggaa atcttttagg ggcagctcag atcttattag acaccatcgt gttcatactg 1800 
gagagaaacc ctatgaatgt agtgaatgtg ggaaagcctt tagccagagg tcacaccttg 1860 
ttacacacca gaaaatccat actggagaga agccctatca gtgcactgaa tgtgggaaag 1920
ccttcaggcg gcgttcactc cttattcaac atcggagaat tcatagtggt gagaaaccct 1980 
atgaatgtaa ggaatgtggg aagctcttca tttggcgcac agctttcctc aaacatcaga 2040
gcctgcatac tggagagaaa cttgaatgtg agaaaacctt cagccaggat gaagagctta 2100
ggggagagca gaaaattcac caggaagcga aagcttattg gtgtaatcag tgtggtaggg 2160 
ctttccaggg cagctcagac ctcatcagac atcaggtaac tcatacaaga gagaaaccat 2220 
atgaatgcaa agaatgtggg aaaactttca atcagagctc agaccttctg agacatcata 2280 
gaattcacag tggagaaaaa ccttatgtat gcaacaaatg tgggaaatct tttaggggta 2340 
gctcagatct tattaaacac catcgtattc atactggaga gaaaccctat gaatgtagtg 2400 
aatgtgggaa agccttcagc cagaggtcac accttgctac acaccagaaa atccatactg 2460 
gagagaaacc ctatcagtgc agtgaatgtg ggaatgcctt caggcggcgt tccctcctta 2520 
ttcaacatcg gagacttcat agtggtgaga aaccctatga atgtaaggaa tgtgggaaac 2580
tcttcatgtg gcacacggct ttcctcaaac atcagagact gcatgctgga gagaaacttg 2640 
aagaatgtga gaaaaccttc agcaaggatg aggagcttag aaaagagcag agaactcacc 2700 
aggaaaagaa agtttattgg tgtaatcagt gtagtaggac cttccagggc agctcagatc 2760 
tcatcagaca tcaggtaact catacaagag agaaaccata tgaatgtaaa gaatgtggga 2820 
aaactcaatc agagctcaga ccttctgaga catcatagaa ttcacagtgg agaaaaacct 2880 
tacgtatgca ataaatgtgg ggaatctttt aggagcagct cagatcttat taaacaccat 2940 
cgtgttcata ctggagagaa acctcatgaa tgtagtgaat gtgggaaagt ctttagccag 3000
aggtcccacc ttgtcacaca ccagaaaatc cacactggag agaagcccta tcagtgcact 3060
gaatgtgaaa aagccttcag gcggcgttca ctccttattc aacgtcggag aattcatagt 3120 
ggtgagaaac cctatgaatg taaggaatgt gggaaactct tcatgtggca cacagctttc 3180 
ctcaaacatc agagactgca tgctggagag aaacttgaag aatgtgagaa aaccttcagc 3240 
aaggatgagg agcttagggg agagcagaaa attcaccaag aagagaaagc ttattggtgt 3300
aatcagtgtg gtagggcttc cagggggcaa ggctcagacc tcatcggaca tcaggtaact 3360
cataccagga gagaaaacca tatgaatgtt aacgaatgtg gggaaaactt ttcaatcaga 3420 
ggtcagactt cttgagacac tcctcgaatc cagggggaaa aactttgttg gaccatgtgg 3480 
gaccctaggg cgccaatttt tacccccgtc tctgggaacc ttaaccggtg gaggctaaaa 3540 
acggtgaaat aaa 3553 



<210> 53 

<211> 1760 

<212> DNA 

<213> Homo sapiens 
<220>

<221> misc_feature

<223> Incyte ID No: 8181605CB1 



<400> 53 

actctcccag 

ccagatgttc 

cttgatgcag 

ggaggcgtgg 

ggattccgga 

caccaccatc 

atgctcggac 

tgtaataaaa 

cctagaggct 

tgcccaggaa 

tggggctgcg 

cccaggagga 

tccggtgtaa 

cctccccgga 

aaccctgtaa 

accccgcggc 

agctgcaagt 

cgggcacagg 

gcgcacggga 

cgcacctcat 

gtgagaagcg 

tgcggccctt 

agcaccagcg 

cgccgcccgc 

cagaccttgt 

acatgtgagc 

gcaggcagca 

caagttgctg 

ctctaaacca 

ttgcaacctg 



attgaacaag 
ctgatggacc 
atcaagcagg 
gcagccgggc 
gcaggagaca 
ccgcccacct 
gggaccctga 
acagaagtcc 
cacgggaccc 
ggagcctggg 
agagcccgcc 
gacccctcct 
accgccagtg 
caacggagag 
ataccctggc 
gccccccggg 
cagcctgagc 
cggtggcggc 
tggcagcgcc 
ccgccatcgc 
cttcaccgaa 
cacctgcacc 
caaccatgca 
acctcctgat 
gaccgactgg 
gcctccagcc 
gggcggcgtg 
aactgatgat 
agtagaattg
ttttgttttt 



ggaaggagca 
ccagtccagg 
agggtgagct 
agccagatat 
tctccacgga 
cctggcaaac 
agctcaacac 
aggaagagga 
tgtttggacc 
aaagccaggg 
cggcctgaga 

ggggactggc 

ggcctgaacc 
gccatcttgg 
cggaccaaag 
ggccggccct 
gcgcaccagc 
agcggcagtg 
cttcggtgtg 
atgctgcaca 
cgctccaagc 
gtctgcggca 
gcgggcgcca 
cccttcaaga 
acttgtggcc 
ccatagcccc 
gaagcttcag 
cactggagaa 
tctgtgaaat 



ctgcaactgg 
ctcggggccc 
ccagctccag 
tggggaggag 
tgccacctct 
ggatctccct 
agcagcctcc 
ggtggtggcc 
aggccaagcc 
caagctcctt 
gggacatggg 
tcttcggagg 
cgaggacggg 
accccagcca 
gctttggcca 
tcacctgcgc 
gcagctgtgg 
gcggcggcgg 
gggagtgcgg 
ccggcgagcg 
tcatcgacca 
aaagcttcat 
agaccccggc
gccccgcctc 
tcagcgtcct 
tgccggccgc 
gcagacgcgg 
agagaaacta 
gggagcgccg 



cgccgcctgg 
ccagttcccg 
gagcagcagg 
ccctggggcc 
ggtgtccatt 
ccccaccatc 
acggaagcag 
acacccgtac 
acacggttct 
ccccagccag 
tgagctcagt 
ggtccggtgg 
gcccgagggg 
ggccccaagg 
caagccaggg 
cacgtgtggg 
ggcgcccgac 
tggcggcagc 
ccgttgcttc 
gcccttcccc 
ctaccgaacg 
ccgcaaggac 
ccgaggccag 
caaaggacct 
gggacccacc 
acgtgtaaaa 
gacggggaga 
tccaccagga 
tagaatttta 



ccccaagatt 
ccccagacct 
ccctgggcgt 
tcagccagct 
ccaacttttc 
cctcttcagc 
atgtaaaaat 
atcctactga 
tccctagtcc 
gaccctgtgc 
cctgctgtgg 
ggctggaatt 
cttccttact 
ccattcaacg 
ctgaagaagc 
aagagcttcc 
gggtcgggcc 
ggtgggggca 
acgcgccccg 
tgcaccgagt 
cacacgggcg 
cacctccgca 
ccactcccga 
ttggcctcca 
gatggcgggg 
agccccgtgt 
accaaatgtc 
cagctgccac 
agtaatttaa 



60 

120 

180 

240 

300 

360 

420 

480 

540 

600 

660 

720 

780 

840 

900 

960 

1020 

1080 

1140 

1200 

1260 

1320 

1380 

1440 

1500 

1560 

1620 

1680 

1740 

1760 



<210> 54 

<211> 2772 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 8266487CB1 



<400> 54 

gtagcgaatt atgttatact ccccttagac ccatgcacca tttaccaact aatgctagaa 60 
acttaaagct tgtttccagc caaatacctc ttcttttatc acaaatgccc tagtgtatag 120 
ctgttgagag agcagcactg agcactctgg gtagcctgtg gctttttatt tccagagtgt 180 
ggagatttcc atctcaagtg tgaagacagc tcatggtttc aggaaaagcc cagagtttca 240 
gacctgctgt tctctcaagt ctcctggtat aatgtgtcaa tcagttccca acctatggac 300
tgtgagctca tagggttcac agtggtcact ccagcgctct gtgaatgtgc ccaactgtgc 360
cctgcctgtg cactaaatga tgaatatatc acacaataat aaaaacaact accatgtgac 420 
catctaccat ttacgcagac acaatcctag ggaagaaaga cggtgggtat atagtgaata 480 
tcggttgaag ggaattaata tctgatcttc tcaattaacg ggtattaaaa gtacaactat 540 
tatcatacac ccattcttga tattttttga aattagaaaa ggtttcttca acctgaaaag 600 
acggctccgg ccccagcacg cccggctacc gctgccccga gccgcagtgc gcgctggcct 660 
tcgccaagaa gcaccagctc aaggtgcacc tgctcacgca cggcggcggt cagggccggc 720
ggcccttcaa gtgcccactg gagggctgtg gttgggcctt cacaacgtcc tacaagctca 780 
agcggcacct gcagtcgcac gacaagctgc ggcccttcgg ctgtccagtg ggcggctgtg 840 
gcaagaagtt cactacggtc tataacctca aggcgcacat gaagggccac gagcaggaga 900 
gcctgttcaa gtgcgaggtg tgcgccgagc gcttccccac gcacgccaag ctcagctccc 960 
accagcgcag ccacttcgag cccgagcgcc cttacaagtg tgactttccc ggctgtgaga 1020 
agacatttat cacagtgagt gccctgtttt cccataaccg agcccacttc agggaacaag 1080
agctcttttc ctgctccttt cctgggtgca gcaagcagta tgataaagcc tgtcggctga 1140 
aaattcacct gcggagccat acaggtgaaa gaccatttat ttgtgactct gacagctgtg 1200 
gctggacctt caccagcatg tccaaacttc taaggcacag aaggaaacat gacgatgacc 1260 
ggaggtttac ctgccctgtc gagggctgtg ggaaatcatt caccagagca gagcatctga 1320
aaggccacag cataacccac ctaggcacaa agccgttcga gtgtcctgtg gaaggatgtt 1380
gcgcgaggtt ctccgctcgt agcagtctgt acattcactc taagaaacac gtgcaggatg 1440 
tgggtgctcc gaaaagccgt tgcccagttt ctacctgcaa cagactcttc acctccaagc 1500 
acagcatgaa ggcgcacatg gtcagacagc acagccggcg ccaagatctc ttacctcagc 1560 
tagaagctcc gagttctctt actcccagca gtgaactcag cagcccaggc caaagtgagc 1620 
tcactaacat ggatcttgct gcactcttct ctgacacacc tgccaatgct agtggttctg 1680 
caggtgggtc ggatgaggct ctgaactccg gaatcctgac tattgacgtc acttctgtga 1740
gctcctctct gggagggaac ctccctgcta ataatagctc cctagggccg atggaacccc 1800
tggtcctggt ggcccacagt gatattcccc caagcctgga cagccctctg gttctcggga 1860
cagcagccac ggttctgcag cagggcagct tcagtgtgga tgacgtgcag actgtgagtg 1920
caggagcatt aggctgtctg gtggctctgc ccatgaagaa cttgagtgac gacccactgg 1980 
ctttgacctc caatagtaac ttagcagcac atatcaccac accgacctct tcgagcaccc 2040
cccgagaaaa tgccagtgtc ccggaactgc tggctccaat caaggtggag ccggactcgc 2100 
cttctcgccc aggagcagtt gggcagcagg aaggaagcca tgggctgccc cagtccacgt 2160 
tgcccagtcc agcagagcag cacggtgccc aggacacaga gctcagtgca ggcactggca 2220 
acttctattt ggtatgaagc actctattca gtcaccacca tataggtcac ttctctcata 2280 
ctcggtcttg aggatattct ggattaatcc tttctatgca gacgtttctg gtttacaaaa 2340 
ggacgcagcc ctggactaca agtctggaac tgacaagttc ttatgacctt gacaaatcac 2400 
cttaacccat ctgagcctta aattctcatt tatttcctgc ataaggagat ttggctaaat 2460
gctttctgag gtcctttgga gtcctgtagc tccatggtaa tgtgctcctt tccttgaaga 2520
ctgggggttt tgtaatgttg agatactttg cctctatgct tctcagctca tgaccagtcc 2580 
tagaagagga gtcgagacat aagccacctt cagaggttca atggaaactt taaaaccata 2640 
ccaaactctt ttttaaaatt agaattaaca aaaaaaaaaa aaagggtggg gtttatgagc 2700 
cttagttctt ggaggattat aagagtactt ccccagtttt gaggctggac agttaatata 2760 
ctttatatca at 2772 

<210> 55 

<211> 1151 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 5552784CB1 

<400> 55 

gaggtcgcgt agggcctatt atgatgattt ctacaggagg ttgaagagat aagacccttc 60 
cctgtgctcc ccccccccca ctccttaatt acggattgag caggggaggg gccggtgggg 120
ctcaggtgag cacacaggga gaaagggacg tgggcggggc cttacagagg gtgagcgaat 180 
ccgaaaagac ctagaacctc gttgctggga gacaagtccc gccctgcagg cggcaccgga 240 
agtggccggc tgggatcagc ctttaagatg gcgtctcctc aggggggcca gattgcgatc 300
gcgatgaggc ttcggaacca gctccagtca gtgtacaaga tggacccgct acggaacgag 360
gaggaggttc gagtgaagat caaagacttg aatgaacaca ttgtttgctg cctatgcgcc 420
ggctacttcg tggatgccac caccatcaca gagtgtcttc atactttctg caagagttgt 480 
attgtgaagt acctccaaac tagcaagtac tgccccatgt gcaacattaa gatccacgag 540 
acacagccac tgctcaacct caaactggac cgggtcatgc aggacatcgt gtataagctg 600 
gtgcctggct tgcaagacag tgaagagaaa cggattcggg aattctacca gtcccgaggt 660 
ttggaccggg tcacccagcc cactggggaa gagccagcac tgagcaacct cggcctcccc 720 
ttcagcagct ttgaccactc taaagcccac tactatcgct atgatgagca gttgaacctg 780 
tgcctggagc ggctgagttc tggcaaagac aagaataaaa gcgtcctgca gaacaagtat 840 
gtccgatgtt ctgttagagc tgaggtacgc catctccgga gggtcctgtg tcaccgcttg 900 
atgctaaacc ctcagcatgt gcagctcctt tttgacaatg aagttctccc tgatcacatg 960
acaatgaagc agatatggct ctcccgctgg ttcggcaagc catccccttt gcttttacaa 1020 
tacagtgtga aagagaagag gaggtagggg ccaagccccc accccatccc actccccttc 1080 
cctccccaga tatttatgtg aaatgaactg cagctttatt ttttgaaata aaaactttta 1140 
aaaaccaaaa a 1151 

<210> 56

<211> 2230

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7281230CB1 



<400> 56 
ggcacctcag tgtgtccctc aggcacctgg acacacaggt gggcaaagat gctgacactg 60
ggggccagaa acgccgatac ccccattctt ccatttttac tcagtctcag gtggtcccaa 120
ggaggggacc ttcgtggcct tgctgtgctc acaggctcac ctttgtgttt ccagctgcag 180
agaaagtacc ctgggccatg cagctgcact cccctcccag gaaaggggca ggatggctgc 240
ccagatgagt gaggcatcag ccctggcGcc ccaggtcttc ccgagtccac tggaactgat 300
ggtgggtgag cccagctcca agagccctgg ccagtgcttc tgggggttct gctatgagaa 360
ggcagcaggg ccccgagggg ccctggctca gctgcgtgag ctgtgttgcc agtggctgat 420
gcctgaggcc tgctccaagg agcagatgct ggagctcttg gtgctggagc agttattggg 480
cacactgctc ccagagatcc aggcctacac gcaggagcag tggctaggca gccctgagga 540
ggccactgcc ctggcagagc ggctacagca ggagtcagct gggccaggac tccagatgag 600
tgggggctgg tctgggggct gggtgccagc ccccaggccc caagaggagc tggtccccag 660
gacagaggag ggagaggagc aagaggctcc cctgggcccc ttccaggccc cacctccagg 720
tcacaggcgc gagatggagt ccccaagagg gtggaccctg caggtggccc cagaggaagg 780
ccaggtcctc tgcaatgtga agactgccac gaggggcctc tctgaggggg ctgtgtctgg 840
aggctggggg gcctgggaaa actccacgga ggttccgagg gaggcagggg acggccagcg 900
gcagcaagcc acactggggg cggcggacga acagggaggc cccggcaggg agctgggccc 960
ccgcagacgg tgggcgggac ggggctgggc ccaggagcga gcctgcagac cgggcgttgc 1020
gcccttcgcc tctccccaga ggagccgggc tgccggtgcg gggagtgcgg caaggcgttc 1080
agccagggct cttacttgct gcagctcggc gcgtgcGCca ggcgagaagc cgtacacgtg 1140
ccccgagtgc ggcaaggcct tcgcctggag ctccaacctc agccagcacc agcgcatcca 1200
cagcggcgag aagccctacg cttgcaggga gtgcggcaag gccttccgcg cgcactcgca 1260
gctcatccac caccaggaga cacacagcgg cctgaagccc ttccgctgcc cggactgcgg 1320
caagtccttc ggccgaagca ccacgctggt gcagcaccga cgcacgcaca cgggcgagaa 1380
gccctacgag tgcccggagt gcggcaaggc cttcagctgg aactccaatt tcctggagca 1440
ccggcgcgtg cacacgggcg cgcggccgca cgcctgccgg gactgtggca aggccttcag 1500
ccagagctcc aacctggccg agcacctgaa gatccacgcg ggcgcacggc cacacgcctg 1560
tcccgactgc ggcaaggcct tcgtgcgtgt ggcggggctg cggcagcacc ggcgcacgca 1620
cagcagcgag aagcccttcc cctgcgccga gtgcggaaag gctttccgcg agagctcgca 1680
gctcctgcag caccagcgca cgcacactgg tgagcggccc ttcgagtgcg ccgagtgcgg 1740
ccaggctttc gtcatgggct cctacctggc ggagcaccgg cgcgtgcaca cgggcgagaa 1800
gcctcatgcg tgcgcccagt gcggcaaggc cttcagccag cgctccaacc tactgagcca 1860
ccggcgcacg cactcgggcg ccaagccctt cgcctgcgcc gactgcggca aggccttccg 1920
cggcagttcc ggcctggcgc SLCcaccggct ttcgcacacg ggagagcgac ccttcgcctg 1980
cgcagaatgc ggcaaggcct tccgcggcag ctccgagctg cgccagcacc agcgcctgca 2040
ctctggcgag aggccgttcg tctgcgccca ctgcagcaag gccttcgtgc gcaagtcgga 2100
gctcttaagc caccggcgca cgcacacggg cgagaggccc tacgcttgcg gcgagtgcgg 2160
gaagcctttc agccaccgtt gcaacctcaa cgagcaccag aagcggcacg ggggccgcgc 2220
tgcgccctga 2230



<210> 57 

<211> 1976 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 7488424CB1



<400> 57 

ggagcgcact tagaggaagc tgtgttttgg tgacctctga aactcagtac tgcggcgaat 60
gagctcctga ccttgaggag tacttaacag aattatttct cgaagaatca ttgtgggaac 120
ccttcaaaga acccagcgaa acatgaattc tggaatctcg caagtcttcc agagggaact 180
cacctgcccc atctgcctga actacttcat agacccggtc accatagact gtgggcacag 240
cttttgcagg ccctgtttct acctcaactg gcaagacatc ccaattctta ctcagtgctt 300
tgaatgccta aagacaacac agcagagaaa cctcaaaact aacattcgat tgaagaagat 360
ggcttcccgt gccagaaaag ccagtctctg gctattcctg agctctgagg agcaaatgtg 420
tggcactcac agggagacaa agaagatatt ctgtgaagtg gacaggagcc tgctctgttt 480
gctgtgctcc agctctctgg agcaccggta tcacagacac tgtcccgctg agtgggctgc 540
tgaggaacac cgggagaagc ttttaaagaa aatgcagtct ttatgggaaa aagtttgtga 600
aaatcagaga aacctgaacg tggaaaccac cagaatcagc cactggaagg attatgtgaa 660
tgtaaggcta gaagctatta gagctgagta tcagaagatg cctgcatttc atcatgaaga 720
agaaaaacat aatttggaga tgctgaaaaa gaaggggaaa gaaatttttc atcgacttca 780
tttaagtaaa gccaaaatgg ctcacaggag ggagatttta agaggaacgt atgcggagct 840
gatgaaaatg tgccataaac cagatgtgga gctacttcag gcttttggag acatattaca 900
caggagtgag tccgtgctgc tgcacatgcc ccagcctctg aatctagagc tcagggcagg 960
gcccatcact ggactgaggg acaggctcaa ccaattccga gtggatatta ctctgcctca 1020
taatgaagcc aacagtcata tcttccgacg tggagatttg agaagcattt gtattggatg 1080
tgaccgtcaa aatgcgcccc atatcactgc aacacctaca agttttcttg catggggtgc 1140
tcagactttc acctctggca aatattactg ggaggtccat gtgggggact cttggaattg 1200
ggcctttggt gtctgtaata agtattggaa agggacgaat cagaatggca atatacatgg 1260
agaggaggga ctctttagtc ttgggtgtgt taagaatgac attcagtgca atctctttac 1320
cacctcccca gttacactgc agtatgtccc aagacctacc aaccatgtag gattattcct 1380
ggattgtgaa gctagaactg tgagcttcgt tgatgttaat caaagctccc ctatatacac 1440
catccctaat tgctccttct cacctcctct caggcctatc ttttgctgta ttcatctctg 1500
accagagaca aatcagaaat gtgtttatct gctgtgggaa cccctttatc ccataaagcc 1560
ctcttcctta tgccttatca aacaggacaa ataggttctg ttttatgtct tgaattgcat 1620
tctaatgtta ttaaaactca tttattgtgt tactattaaa tatgctgaaa acgctaaaag 1680
tatacgtatt ggttctttat taaataattt ttgaaaaatc attattcatg atcatggcat 1740
acagtatatt ctcttttttt tctttattta tgactgtcac tgagtgaaat aatagatgac 1800
agacatgtct gaatgaagta aaaatcaatg gaagacagtc gggatctttt gcttcatgca 1860
aaaaacttgg agtgaagtct caatgataac tgggaaatgt ttttcttcct ctttatctaa 1920
ctatattaca cttatccatc aggtttcatt gtattaatct atcctttgag gtaata 1976



<210> 58 
<211> 1357 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7487110CB1



<400> 58 

atgacaatgg aaggggccag cgggtcgagt tttggaatag acacgatttt gtccagtgcc 60
agttcaggca gcccaggcat gatgaatgga gatttccgcc cgctcggtga ggccaggacc 120
gcggatttta ggagtcaggc caccccatct ccctgttcgg agattgatac cgtagggacg 180
gcgccttctt ctcctatctc agtcaccatg gagcGcccgg agccgcatct ggtagcagac 240
gcgacccagc atcatcacca cctccaccac agccagcagc cgccgccgcc ggccgcggcc 300
ccgacgcaaa gtttgcagcc tttgccccaa cagcagcagc cgctgccgcc acagcagccg 360
ccgccgccgc ccccccagca gctgggctcg gccgcctcgg cccccaggac ttccacgtct 420
tcttttttaa ttaaggacat cttgggcgac agcaaacctc tggcggcatg tgcaccctac 480
agcaccagcg tatcctctcc ccaccacacc ccgaagcagg agagcaacgc agtgcacgag 540
agcttcaggc caaagctcga gcaggaggac agcaagacca aactcgacaa gcgggaggat 600
tcccagagcg acatcaaatg ccacgggaca aaggaggaag gagaccggga gattacgagt 660
agccgtgaga gtccccctgt gagagccaag aagcctcgaa aagcaaggac agctttttcc 720
gaccaccagc tcaatcaact ggagcgtagc tttgagcggc agaagtacct gagcgtgcag 780
gatcgcatgg acctggctgc agcgctcaac ctcactgaca cccaagtcaa gacctggtac 840
cagaaccgca ggaccaagtg gaagcggcag acagcggtgg gcctggagtt gctggccgag 900
gcagggaact actcggcgct gcagaggatg tttccatcgc cttatttcta tcacccaagc 960
ctgctgggca gcatggacag cactacggcg gcggcggctg ccgctgccat gtacagcagc 1020
atgtaccgga ctcctccagc accccatccc cagctgcagc ggcccctggt gccccgtgtg 1080
ctcatccacg gcctagggcc tgggggacag ccagccctta atccattgtc cagccccatc 1140
ccaggcaccc cacacccccg gtgaaaacat tgcagcgaag gcactgcaat cccttcccca 1200
tatccctgcc caacccggac tgctgcccgt cctctccctg ctccaggcca ctaggcctcc 1260
cttgcagcca actctggaag gcagaggagt aagagaggaa gatgcttacc agtgggcagg 1320
ggacccccaa agaggagcca gccctctgct ctccatc 1357

<210> 59

<211> 2153 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7495008CB1 



<400> 59 

gcgcgcgtga acgcggtccc cgggaccatg ctgcggccac agcggcccgg agacttgcag 60
ctcggggcct ccctctacga gctggtgggc tacaggcagc cgccctcctc ctcctcctcc 120
tccacctcct ccacctcctc cacttcctcc tcctccacga cggcccccct cctccccaag 180
gctgcgcgcg agaagccgga ggcgccggcc gagcctccag gccccgggcc cgggtcaggc 240
gcgcacccgg gcggcagcgc ccggccggac gccaaggagg agcagcagca gcagctgcgg 300
cgcaagatca acagccgcga gcggaagcgc atgcaggacc tgaacctggc catggacgcc 360
ctgcgcgagg tcatcctgcc ctactcagcg gcgcactgcc agggcgcgcc cggccgcaag 420
ctctccaaga tagccacgct gctgctcgcc cgcaactaca tcctactgct gggcagctcg 480
ctgcaggagc tgcgccgcgc gctgggcgag ggcgccgggc ccgccgcgcc gcgcctgctg 540
ctggccgggc tgcccctgct cgccgccgcg cccggctccg tgctgctggc gcccggcgcc 600
gtaggacccc ccgacgcgct gcgccccgcc aagtacctgt cgctggcgct ggacgagccg 660
ccgtgcggcc agttcgctct ccccggcggc ggcgcaggcg gccccggcct ctgcacctgc 720
gccgtgtgca agttcccgca cctggtcccg gccagcctgg gcctggccgc cgtgcaggcg 780
caattctcca agtgagggcg ggcctgggcc tggggcgcga cctcggcccg gcctcccttc 840
gctcagcttc tccgcgcccc tgctccctgc gtctgggaga gcgaggccga gcaaggaaag 900
catttcgaac cttccagtcc agaggaaggg actgtcgggc acccccttcc ccgcccccac 960
ccctgggacg ttaaagtgac cagagcggat gttcgatggc gcctcggggc agtttggggt 1020
tctgggtcgg ttccagcggc tttaggcaga aagtgctcgc tctcacccag cacatctctc 1080
tccttgtccc tggagttgcg cgcttcgcgg ggccgatgta gaacttaggg cgccttgccg 1140
tggttggcgc gccccgggtg cagcgagagg ccatccccga gcgctacctc cccggagcgg 1200
agcacgccgg ctcccagtac taggggctgc gctcgagcag tggcgggggc ggaggggtgg 1260
ttcttttcct tctcctccgc cagaggccac gggcgccctt gttcccgccg gccaggtcct 1320
atcaaaggag gctgccggaa ctcaagaggc agaaaaagac cagttaggcg gtgcagacgg 1380
tctgggacgt ggcagacgga cggaccctcg gcggacaggt ggtcggcgtc ggggtgcggt 1440
gggtaggggc gaggacaacg cagggtgcgc tgggttggga cgtgggtcca cttttgtaga 1500
ccagctgttt ggagagctgt atttaagact cgcgtatcca gtgttttgtc gcagagagtt 1560
ttcgctctta aatcctgggg gtttcttaga aagcaactta gaactcgaga ttcacctttc 1620
gtttcccttt ccccaaaagt agcgtaacca acatttaagc ttgcttaaaa acgaaaacca 1680
accgccttgc atccagtgtt cccgatttac taaaataggt aaccaggcgt ctcacagtcg 1740
ccgtcctgtc aagagcgcta atgaacgttc tcattaacac gcaggagtac cgggagccct 1800
gaaccgcccg ctgctcggcg gatcccagct gcggtggcga cggcgggaag gcgctttccg 1860
ctgttcctca gcgggccggg cccttgacca gcgcggcccg caggtcttcc ttctcgccgt 1920
cttgcagttg aagagctaca tacgtagtca gtttcgattt gttacagacg ttaacaaatt 1980
cctttaccca aggttatgct atgacctttc cgcagtttac tttgattttc tatgtttaag 2040
gttttggttg ttggtagtag ccgaatttaa ctggcacttt attttacttc taaccttgtt 2100
tcctgacggt gtacagaatc aacaaaataa aacatttaaa gtctgatttt tta 2153



<210> 60 

<211> 1104 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 7073515CB1 

<400> 60 

atgttcggga aaccagacaa aatggacgtt cgatgccact cggacgccga ggctgcccgg 60
gtctcgaaga acgcgcacaa ggagagtcgg gagagcaagg gcgcggaggg gaacctccca 120
gccgccttcc tcaaggagcc gcagggcgcc ttctcagcgt cgggcgctgc tgaggattgt 180
aacaaaagta aatccaattc cgcagcggac ccggattact gccgccggat cctggtccga 240
gatgccaagg ggtccatccg agagatcatc ctgcGcaagg gcctggactt ggaccggcct 300
aagaggacgc gcacgtcctt caccgcggag cagctctatc ggctggagat ggagttccag 360
cgctgccagt acgtggtggg ccgcgagagg accgagctcg cccggcagct taacctctcc 420
gagacccagg tgaaggtctg gttccagaac cggcgcacca agcagaagaa ggaccagggc 480 
aaggactcgg agctacgctc ggtggtgtcg gagaccgcgg ccacgtgcag cgtgctacgg 540 
ctgctggagc agggccgcct gttgtcgccg cccggcctgc ctgcgctgct gccgccttgc 600 
gccacgggcg ctctcggttc agcgctgcgc gggcccagct tgccggccct gggcgcgggc 660 
gccgctgcag gctcggccgc cgcagccgcc gccgccgccc cgggcccagc gggcgctgca 720 
tccccgcacc cgccggctgt gggcggtgct ccaggtcccg ggcccgccgg gccgggggga 780 
ttgcacgcat gcgccccggc cgcgggccac agcctcttca gcctgccggt gccctcgctg 840 
ctcggctccg tcgccagccg cctgtcctcc gccccgttaa caatggctgg ttcgctagct 900 
gggaatttgc aagaactctc cgcccgatat ctgagctcct cggccttcga gccttactcc 960 
cggaccaaca ataaagaagg ggccgagaaa aaagcgctgg actgatttta agtgtttccc 1020 
tgtatttata tttatagtat ctattgtggt gatatttatg gactcacgcg cattaggtgc 1080 
ccagagctcc tgcgctgggc tcct 1104 

<210> 61 

<211> 2597 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 3356640CB1 



<400> 61 

gcggggtctt tgtctcgctg cagcgggtgc tgcaggtctg gccttcactt ttctgcgtcc 60
tcttactcct agaggcGcag cctctgtggc gctgtgatct ggttatcggg agattcacag 120
ctaagacgcc aggatccccc ggaagcctag aaatgggacc actgacattt atggatgtgg 180
ccatagagtt ctctttggag gagtggcagt gcctggacac tgcacagcgg aatgtatata 240
ggcatgtgat gttagagaac tacagaaacc tggttttctt ggccagacct gattacctgt 300
ctggaacaag gaaaagagcc ctggaatatg aagagacatg agatggtggt agccaaacat 360
tcagctctgt gttctcgttt tgcccaagac ctttggctag agcagaacat aaaagattct 420
ttccaaaaag tgacactgag cagatatgga aaatatggac ataagaattt acagttaaga 480
aaaggctgta aaagtgtgga tgagtgtaag gaacaccaag gaggttataa tggacttaac 540
caatgtttga aaattaccac aagcaaaata tttcaatgta ataaatatgt aaaagtcatg 600
cataaatttt caaattcaaa tagacacaag ataagacata ctgaaaataa acatttcaga 660
tgtaaagaat gtgacaaatc actttgcatg ctttcacgcc taactcaaca taaaaaaatt 720
catactagag agaatttcta caaatgtgaa gagtgtggaa aaacctttaa ctggtccaca 780
aacctttcta aacctaagaa aattcatact ggagaaaaac cctacaaatg tgaagtatgt 840
ggaaaagcct ttcaccaatc ctcaatcctt actaaacata agataattcg tactggagaa 900
aaaccctata aatgtgcaca ctgtggcaaa gcctttaaac agtcctcaca ccttactaga 960
cataagataa ttcatactga agagaaaccc tacaaatgtg aacaatgtgg caaggtcttt 1020
aagcagtccc caacccttac taaacatcag ataatttata ctggagagga accatacaaa 1080
tgtgaggaat gtggcaaagc ttttaaccta tcttaacaac ttactgaaca taagaaaatt 1140
tacactagag agaaagccta caaatgtgaa gaatgtggca aagcctttaa ccagttttca 1200
acccttatta cacataagat aattcatagc ggagagaaac cccacaaatg tgaagaatgt 1260
ggcagagctt ttaaccagtc cgcaaagctc actgaacata agttaattca tactggagaa 1320
aaaccctaca aatgtaaaga atgtggaaaa gcttttcacc gatactcaat ccttagtaca 1380
cataagaaaa ttcatactgg ggagaaaccc cacaaatgtg gagaatgcgg aaaagccttt 1440
aactggtcct caactcttat tacacataag ataattcaca gtggagaaaa accctacaaa 1500
tatgaagaat gtggcaaagc ttttaaccag tcctcacacc ttatgagaca taagaaaatt 1560
catagtaaag agaaacctta caaatgtgaa cagtgtggca aggtctttaa gaagtcctca 1620
actcttactg cacataagat cattcatact ggagagaaac cttacaaatg tgaggaatgt 1680
ggcaaaggtt ttagccaact ctcaaacctt actaaacaca agaagattca tactagagag 1740
aaaccctaca aatgtgaaga atgtggcata tcttttaacc agttctcaca acttgctata 1800
cataagatga ttcacacttg aatgaaaccc tacaaatgtg aacgatgtgg cagttgtttt 1860
aactagttct cgaactttac tatgcataag aaaattcaaa ctggagagaa actctacaaa 1920
tgtgaagaat gtggcaaagc ttttaaccaa gtctcaacac ttactataca taagataatt 1980
tatactggag caaaaccttg gaaattcaaa gaatgtggta aaacttataa tcctcaaaac 2040
ttcttacacc taaaattcat gcaggagaga aaccccacaa atgtgaaaaa tttggtaaat 2100
tctttaacaa gtcttcaacc ctttctgcac ataatataat tcatactgga gagaaacccc 2160
acaaatatga agaatgtggt aatgctttta accaattctc aaatcttact aaacaaaatt 2220
aatactgaaa atgttacaaa ccagaaaaat gtgaaaatga ttttaacaaa accttcaaat 2280
ttttctaaac ataaaggaaa tcatactggt aagaaattat aaaaatgtga agaatgtgac 2340
aaagccttta aatggttgtc acacttgatt gtaggtaaga taattcatac tggcagaaac 2400
tcccagaagt gtgaagaata tggcaaaact ttaattccta taccttattg cacaggaaag 2460 
catttatact tcagaaaatg ttgtactgat ataaagaatg tagaaaagcc attaatatgt 2520 
gcttacatct tattcaacat tagagagtta gtacttaata aaagcattat aaatgcaatt 2580 
actgtcaaaa aaaaaaa 2597 



<210> 62 
<211> 1959 
<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature 

<223> Incyte ID No: 2015706CB1 



<400> 62 

gggacgttga gtccccacag acctggaaat tgcggcccct ctttctcaac ccagagcaaa 60
ttgagacgtc cgggctggag ggcagttgtg cgatcttgat tcacttcaac ctctgctctc 120
cgggctcaag caattctccc acctcagcct gagtctcact ctgtccttca ggctgtgagt 180
gcagtctcct gatcttggct cactgcaacc tctgtctcct gggttcaagt gattcttgtg 240
cctcagccta ccaagtagct gagatacagg attgacttct aaagactctt ggtacctgag 300
gaagaaaccc agaagaggaa gaggaaagca aaggagtcgg ggatggctct ttctcagggt 360
ctactgacat tcagggatgt ggccatagaa ttctctcagg aggagtggaa atgcctggac 420
cctgctcaga ggactctata cagagacgtg atgctggaga attataggaa cctggtctcc 480
ctggatatct cttccagatg catgatgaat acattgtcat caacagggca aggcaataca 540
gaagtgatcc acacagggac attgcagaga caagcaagtt atcacattgg agcattttgc 600
tcccaggaaa ttgagaaaga cattcatgac tttgtgtttc agtggcaaga agatgaaaca 660
aatgaccatg aagcacccat gacagaaata aaaaagttga ctagtagtac agaccgatat 720
gatcaaaggc atgctggaaa caagcctatt aaaggtcagc ttgaatcaag atttcatttg 780
catttgcgaa gacataggag aattcatact ggagagaaac cttacaaatg tgaagaatgt 840
gagaaagttt tcagttgcaa atcacatctt gaaatacata ggataattca tactggagag 900
aaaccataca aatgtaaggt ttgcgacaag gcttttaagc atgattcaca cctggcaaaa 960
catactagaa ttcacagggg agacaaacat tacacatgta atgaatgtgg caaggttttt 1020
gatcaaaaag caacccttgc atgtcatcat agaagtcata ctggagagaa accttataag 1080
tgtaatgagt gtggcaagac ctttagtcag acatcacacc ttgtgtacca tcatagactg 1140
catactggag agaaacctta caaatgtaat gagtgtggca agaccttcgc tcgaaattca 1200
gtcctcgtaa ttcataaggc agttcatact gcagagaaac cctataagtg taatgaatgt 1260
ggcaaggttt ttaagcaacg agcaactctt gcaggacatc gtagagttca cactggagag 1320
aaaccttaca gatgtgaaga atgcgacaaa gttttcagtc gcaaatcaca tcttgaaaga 1380
cataggagga ttcatactgg agagaaacca tacaaatgta aggtttgtga caaggctttc 1440
cggagtgatt cacgtcttgc agaacatcag agagttcata ctggagagag accttacaca 1500
tgtaatgaat gtggcaaggt ttttagtaca aaagcatacc tcgcatgtca tcaaaaactt 1560
catactggag agaaacttta cgaatgtgaa gaatgtgaca aagtttacat tcgcaaatca 1620
caccttgaaa gacataggag gattcataca ggagagaaac ctcacaagtg tggtgattgt 1680
ggtaaagcct ttaattcacc ttcacacctt attaggcatc agagaatcca tactggacag 1740
aaatcttaca aatgtcatca gtgtggcaag gtctttagtc tgaggtcact ccttgcagaa 1800
catcagaaaa ttccttttgg agacaattgt ttcaagtgca atgagtatag caaaccatca 1860
agcattaatt gacatcagag tcaaatcagc attgacctga gtttgagttg acttaacatt 1920
gatttcaagc attaatcgac attaaactgt ttatgttaa 1959



<210> 63 

<211> 1401

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 6920755CB1 



<400> 63 

attttaacta aaggttatta tcataaagca ggtgtttgct gaagacagct tactcagatc 60
actactgcct ggaggtggtt gatatatcct ggtgtaaacc ttcaagaagg gcacaggcag 120
gaaaacatga gccagcaact gaagaaacgg gcaaagacaa gacaccagaa aggcctgggt 180
ggaagagccc ccagtggggc taagcccagg caaggcaagt caagccaaga cctgcaggcg 240
gaaatagaac ctgtcagcgc ggtgtgggcc ttatgtgatg gctatgtgtg ctatgagcct 300
ggccctcagg ctctcggagg ggatgatttc tcagactgtt acatagaatg cgtcataagg 360
ggtgagtttt ctcaacccat cctggaagag gactcacttt ttgagtcctt ggaataccta 420
aagaaaggat cagaacaaca gctttctcaa aaggttttcg aagcaagctc ccttgaatgt 480
tctttggaat acatgaaaaa aggggtaaag aaagagcttc cacaaaagat agttggagag 540
aattcgcttg agtattctga gtacatgaca ggcaagaagc ttccgcctgg aggaatacct 600
ggcattgacc tatcagatcc taaacagctc gcagaatttg ctagaaagaa gccccccata 660
aataaagaat atgacagtct gagcgcaatc gcttgtcctc agagtggatg cactaggaag 720
ttgagggata gagctgccct gagaaagcat ctcctcattc atggtccccg agaccacgtc 780
tgtgcggaat gtgggaaagc gttcgttgag agctcaaaac taaagagaca tttcctggtt 840
catactggag agaagccgtt tcggtgcact tttgaagggt gcggaaagcg cttctctctg 900
gactttaatt tgcgtacgca cgtgcgcatc cacacggggg agaaacgttt cgtgtgtccc 960
tttcaaggct gcaacaggag gtttattcag tcaaataacc tgaaagccca catcctaacg 1020
catgcaaata cgaacaagaa tgaacaagag ggaaagtagt cctccaacag gatgaagcag 1080
attaacagaa gagtgatcag tgacaaacat gcctcattga ttattgtttc taggaaggaa 1140
tttctaaatc aatattgcaa ccccaaaagc ggttataatt tg'Q^tgttact aagatgctcc 1200
tacactttgt gataccgttt taaggacatg gtgcattttt ttttctttta tttgttttat 1260
ttagaacttt ttttatttgt tttatttaga actttgtgtg ttcttaaagt gtgcttccaa 1320
caggaaggtc agtgataaat tttcaaaagc ataaccttca atatattatc tgttggatta 1380
ttggatataa gacttatttt 1401



<210> 64 

<211> 3406 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 444179CB1 



<400> 64 

cactgtgtct ctgttgacct gccagatagg aaatagcgcc agagttgctg caatccacgt 60
ctccctggaa ttcaggagca gcttggctgg aaggctctgg ctggaaactt gcagaaggct 120
gtcagctggg tctgcatcat ctgaaggctc aacgggtacc gggggccact ccatggcctg 180
cgagtcagag ctggccgtcg gtgggaggcc tcagttcctc tccacatagg cctctccaca 240
gggccacttg agcatcttcc caccatggca actggtgtcc tctagagcaa catccaggag 300
accaaggcgg aagcagctgt agtgccttct aggacctggt cctgaaagtc acccctgcac 360
ccacccccgc tgtctcctca gccatattct gctggtcaca cagagcgctg cgggataaag 420
gaggagcgtc ctgcttcccg gctgccctgt tgctgtcgga gtcacaggat ggcggctgtc 480
atcctgccct cgactgctgc tccgtcttcc ctgttcccag cctctcagca aaaaggacac 540
acacagggcg gagagctggt taatgagctc ctgacaagct ggctacgggg cttggtaacc 600
ttcgaggatg tggccgtgga gttcacccag gaggagtggg cgttgctgga ccctgcccaa 660
aggacactgt acagggatgt gatgctggag aactgcagga acctggcctc actagggtgt 720
cgtgttaata aacccagtct gatatcccag ttggaacaag acaagaaggt ggtgacagag 780
gaaagaggaa ttctaccaag cacctgtcca gatttggaga ctctacttaa agccaaatgg 840
ttaactccta agaagaatgt tttcagaaaa gaacagtcta aaggtgtaaa aacggaaaga 900
agtcatcgtg gagtgaaact caatgaatgt aatcagtgtt ttaaagtctt cagcacgaaa 960
tctaacctaa ctcagcacaa gagaattcat actggagaaa aaccctatga ctgtagtcaa 1020
tgtgggaagt ccttcagtag cagatcttac cttactattc ataagagaat ccataatggg 1080
gagaaaccct atgaatgcaa tcactgtggg aaagcattta gtgatccctc atcccttaga 1140
ctgcatttga gaattcacac tggagaaaaa ccctatgaat gtaaccagtg ttttcacgtt 1200
ttccgcacca gttgtaacct caaaagccac aagaggattc acacggggga gaatcaccat 1260
gaatgtaatc agtgtggaaa agctttcagc acaaggtcct ctctcactgg gcacaatagc 1320
attcatacag gggagaaacc ttatgaatgt cacgattgtg ggaaaacctt caggaagagc 1380
tcctatctga cacagcacgt aagaactcat actggagaaa aaccctatga atgtaacgag 1440
tgtgggaaat ccttcagcag tagcttttct cttactgtgc acaagagaat acataccgga 1500
gagaaaccct acgagtgcag tgactgtgga aaagccttta ataatctctc agctgtgaag 1560
aaacacttaa gaactcacac tggagaaaaa ccctatgaat gtaatcattg tgggaaatcc 1620
ttcacaagta actcctatct ttctgtgcac aagagaatac ataatagatg gatatgaatt 1680
actgcaggaa cttctggagg aaagcactca ttgatctttc atccctaaga tagtttgaga 1740
gagctcacac tggatatata agttatttgt tgcagcataa caaatgatcc ccgcaaattt 1800
tgtggcttca aacacgaaat gtttattctc taacaatttc tacagatcag gaattcaaga 1860
acagcttatc tctgtagtta cactgcagga tctctcatgc ctttacttac aatcaagatg 1920
tcagccaaag ctatagtcat ctaacatctt tactgatgaa cagtccactt ccaagatgac 1980 
ttattcacat gcctggaaag ttgatgttgg ttgttgtcag gggaccttca gtcattacca 2040 
tgtaggcctc tttacaaggc ccagctcctt gacaatttgg caataagttc tcgccaaggt 2100 
tagtgatcca agagaggacc aatacagagg ccaaaatggc ttttatgacc tatcctcagg 2160 
aggactgcta agctttcact tatttcctat tgtgttggtc tcacaacaca ccaatcctga 2220 
tacaatgtga gagactgcag aaggtatgaa taccaggaaa cagataactc agaagcatct 2280
gggaaatttg ccatggaatt aatgaagaaa agccctcagc acatcatcct tctactctaa 2340 
tcaatatgaa atagcattca gtaactccta ttaattcact ctgaagagaa accttttgag 2400 
gacaatctgt atggtaaagc tctcagctcg aattctcacc ttcatgggcc cagaagatta 2460 
tgtactgaga gaatcctgaa gaatggaaca gctgtagaaa gccttcagtg ctatcagcag 2520 
ggattcatgt gacagctcac actaagggag aaaacctgtg aaggtctcag aaggcttcag 2580 
tgacagtcat cccttaagac gcactggaaa tcacacaatg gaacaacact cagaaatgcg 2640 
gaataccctg cagcaaaagt gttcacatta ctgagcaaac tatagtgtgg tattctgcag 2700 
tgactatggg aaagccttga atgttctatc ggtttttaag ggacttgaga attaattctg 2760 
gagagaatgc cccattgaac atcatcaata ttggagagct ttctgttttt ctacatttgt 2820
taggaaactt gtgagcattc acactacaga gaaacttgaa atataaagaa gaagagaaag 2880
ccttcagtga tgcctctgtg ttagggaaaa tatggaactt ctccctggat gcaaaaccta 2940 
tgagtatatt aatattggaa aatttttcag tgattcttct ctttcttgta tatgagagaa 3000 
cttatatgga gaaaccccta ggaatgtaat cagtgttagg atgcctcagc ctgaactctt 3060 
cactgagtgg ccacaatttt cactgggaac aaaaagtata atcactgttt tgagtgtggg 3120 
atatccttta tcagtgtctc atctgtagat tggactgctg gctcattaat ttttttagtc 3180
tttttttctt ttaatataaa catttgtgta tagctgttcc ctaaaataaa cattaacata 3240 
tttcataatt ttaatgcaat gtatttatta taattaattt gcttgttaaa gacatccaca 3300
cattgcatat tcaaaaagtt atttccaaat accttctgag tgattcagtt tatcataatg 3360
gaaattagta tttataaaca catttttcta atgtagtggt attttt 3406 

<210> 65 

<211> 2718 

<212> DNA

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 5628380CB1 

<400> 65 

caagaattag agacaagcgg tcagcagagc ctcagtgctg atcgtcggag cttgggagca 60 
ggaagatgtc gaatgaactt gatttcaggt ctgtgcggct gctaaagaac gacccagtca 120 
acttgcagaa attctcttac actagtgagg atgaggcctg gaagacgtac ctagaaaacc 180 
cgttgacagc tgccacaaag gccatgatga gagtcaatgg agatgatgac agtgttgcgg 240 
ccttgagctt cctctatgat tactacatgg gtcccaagga gaagcggata ttgtcctcca 300
gcactggggg caggaatgac caaggaaaga ggtactacca tggcatggaa tatgagacgg 360
acctcactcc ccttgaaagc cccacacacc tcatgaaatt cctgacagag aacgtgtctg 420 
gaaccccaga gtacccagat ttgctcaaga agaataacct gatgagcttg gagggggcct 480 
tgcccacccc tggcaaggca gctcccctcc ctgcaggccc cagcaagctg gaggccggct 540 
ctgtggacag ctacctgtta cccaccactg atatgtatga taatggctcc ctcaactcct 600 
tgtttgagag cattcatggg gtgccgccca cacagcgctg gcagccagac agcaccttca 660 
aagatgaccc acaggagtcg atgctcttcc cagatatcct gaaaacctcc ccggaacccc 720
catgtccaga ggactacccc agcctcaaaa gtgactttga atacaccctg ggctccccca 780 
aagccatcca catcaagtca ggcgagtcac ccatggccta cctcaacaaa ggccagttct 840 
accccgtcac cctgcggacc ccagcaggtg gcaaaggcct tgccttgtcc tccaacaaag 900 
tcaagagtgt ggtgatggtt gtcttcgaca atgagaaggt cccagtagag cagctgcgct 960 
tctggaagca ctggcattcc cggcaaccca ctgccaagca gcgggtcatt gacgtggctg 1020 
actgcaaaga aaacttcaac actgtggagc acattgagga ggtggcctat aatgcactgt 1080 
cctttgtgtg gaacgtgaat gaagaggcca aggtgttcat cggcgtaaac tgtctgagca 1140 
cagacttttc ctcacaaaag ggggtgaagg gtgtccccct gasLCctgcag attgacacct 1200 
atgactgtgg cttgggcact gagcgcctgg tacaccgtgc tgtctgccag atcaagatct 1260
tctgtgacaa gggagctgag aggaagatgc gcgatgacga gcggaagcag ttccggagga 1320 
aggtcaagtg ccctgactcc agcaacagtg gcgtcaaggg ctgcctgctg tcgggcttca 1380
ggggcaatga gacgacctac cttcggccag agactgacct ggagacgcca cccgtgctgt 1440 
tcatccccaa tgtgcacttc tccagcctgc agcgctctgg aggggcagcc ccctcggcag 1500 
gacccagcag ctccaacagg ctgcctctga agcgtacctg ctcgcccttc actgaggagt 1560 
ttgagcctct gccctccaag caggccaagg aaggcgacct tcagagagtt ctgctgtatg 1620
tgcggaggga gactgaggag gtgtttgacg cgctcatgtt gaagacccca gacctgaagg 1680
ggctgaggaa tgcgatctct gagaagtatg ggttccctga agagaacatt tacaaagtct 1740
acaagaaatg caagcgagga atcttagtca acatggacaa caacatcatt cagcattaca 1800
gcaaccacgt cgccttcctg ctggacatgg gggagctgga cggcaaaatt cagatcatcc 1860
ttaaggagct gtaaggcctc tcgagcatcc aaaccctcac gacctgcaag gggccagcag 1920
ggacgtggcc ccacgccaca cacaacctct ccacatgcct cagcgctgtt acttgaatgc 1980
cttccctgag ggaagaggcc cttgagtcac agacccacag acgtcagggc cagggagaga 2040
cctagggggt cccctggcct ggatccccat ggtatgcttg aatctgctcc ctgaacttcc 2100
tgccagtgcc tccccgtacc ccaaaacaat gtcaccatgg ttaccaccta cccagaagac 2160
tgttccctcc tcccaagacc cttgtctgca gtggtgctcc tgcaggctgc ccgttaagat 2220
ggtggcggca cacgctccct cccgcagcac cacgccagct ggtgcggccc ccactctctg 2280
tcttccttca acttcagaca aaggatttct caacctttgg tcagttaact tgaaaactct 2340
tgattttcag tgcaaatgac ttttaaaaga cactatattg gagtctcttt ctcagacttc 2400
ctcagcgcag gatgtaaata gcactaacga tcgactggaa caaagtgacc gctgtgtaaa 2460
actactgcct tgccactcac tgttgtatac atttcttatt tacgattttc atttgttata 2520
tatatatata aatatactgt atatatatgc aacattttat atttttcatg gatatgtttt 2580
tatcatttca aaaaatgtgt atttcacatt tcttggactt tttttagctg ttattcagtg 2640
atgcattttg tatactcacg tggtatttag taataaaaat ctatctatgt attacgtcac 2700
attacttcaa aaaaaaaa 2718



<210> 66 

<211> 3325 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7493789CB1 



<400> 66 

gccgagcGcg ccgcatgtgg cccggctccc ggacacctcc ccggcgtcct ccgcgccggc 60
cgctcctgcc ccgacgtcgc tccggcacgg ctcggggccc agaggcgagg cgaggacgcc 120
gggcaagcca ggcagcggaa ctgacgccgg cgagcttccg gggcggcccc gggcaggtcg 180
gcggcggcgg cccgcagtcg tggaggagcg gtgggagcgt cggcggccgc gggcgatgca 240
acttccggac gggactcccc tctgtccgcg cctcacatct ccccttcctc tcgcctagtc 300
ctgtgccgtt ttccgtccgc gactcttccg gcccagagct ttcggagtgc ggttgctcag 360
gggaagccgt cgccgccccc gcctcggggc cgagtgagag tgcccgtcgc gtcgcgccgc 420
gtcgcccccc gggccgcctc cttgccgcca gtggcgggct ccgttctccc tcgaagcact 480
ccccccagct ccatgaatgg aaatcggctc cgcaggaccc gctggggccc agcccctact 540
catggtgccc agaagacctg gctatggcac catgggcaaa cccattaaac tgctggctaa 600
ctgttttcaa gttgaaatcc caaagattga tgtctacctc tatgaggtag atattaaacc 660
agacaagtgt cctaggagag tgaacaggga ggtggttgac tcaatggttc agcattttaa 720
agtaactata tttggagacc gtagaccagt ttatgatgga aaaagaagtc tttacaccgc 780
caatccactt cctgtggcaa ctacaggggt agatttagac gttactttac ctggggaagg 840
tggaaaagat cgacctttca aggtgtcaat caaatttgtc tctcggtaca cacctgtggg 900
gcgttcattt ttctccgctc cagaaggata tgaccaccct ctgggagggg gcagggaagt 960
gtggtttgga ttccatcagt ctgttcggcc tgccatgtgg aaaatgatgc ttaatatcga 1020
tgtttctgcc actgccttct acaaagcaca acctgtaatt cagttcatgt gtgaagttct 1080
tgatattcat aatattgatg agcaaccaag acctctgact gattctcatc gggtaaaatt 1140
caccaaagag ataaaaggtt tgaaggttga agtgactcat tgtggaacaa tgagacggaa 1200
ataccgtgtt tgtaatgtaa caaggaggcc tgccagtcat caaacctttc ctttacagtt 1260
agaaaacggc caaactgtgg agagaacagt agcgcagtat ttcagagaaa agtatactct 1320
tcagctgaag tacccgcacc ttccctgtct gcaagtcggg caggaacaga aacacaccta 1380
cctgccacta gaagtctgta atattgtggc agggcaacga tgtatcaaga agctaacaga 1440
caatcagact tccactatga tcaaggcaac agcaagatct gcaccagata gacaagagga 1500
aattagcaga ttggtaagaa gtgcaaatta tgaaacagat ccatttgttc aggagtttca 1560
atttaaagtt cgggatgaaa tggctcatgt aactggacgc gtacttccag cacctatgct 1620
ccagtatgga ggacggaatc ggacagtagc aacaccgagc catggagtat gggacatgcg 1680
agggaaacaa ttccacacag gagttgaaat caaaatgtgg gctatcgctt gttttgccac 1740
acagaggcag tgcagagaag aaatattgaa gggtttcaca gaccagctgc gtaagatttc 1800
taaggatgca gggatgccca tccagggcca gccatgcttc tgcaaatatg cacagggggc 1860
agacagcgta gagcccatgt tccggcatct caagaacaca tattctggcc tacagcttat 1920
tatcgtcatc ctgccgggga agacaccagt gtatgcggaa gtgaaacgtg taggagacac 1980
acttttgggt atggctacac aatgtgttca agtcaagaat gtaataaaaa catctcctca 2040
aactctgtca aacttgtgcc taaagataaa tgttaaactc ggagggatca ataatattct 2100
tgtacctcat caaagacctt ctgtgttcca gcaaccagtg atctttttgg gagccgatgt 2160
cactcatcca cctgctggtg atggaaagaa gccttctatt gctgctgttg taggtagtat 2220
ggatgcacac ccaagcagat actgtgccac agtaagagtt cagagacccc gacaggagat 2280
catccaggac ttggcctcca tggtccggga acttcttatt caattttata agtcaactcg 2340
gttcaagcct actcgtatca tcttttatcg ggatggtgtt tcagaggggc agtttaggca 2400
ggtattatat tatgaactac tagcaattcg agaagcctgc atcagtttgg agaaagacta 2460
tcaacctgga ataacctaca ttgtagttca gaagagacat cacactcgat tattttgtgc 2520
tgataggaca gaaagggttg gaagaagtgg caatatccca gctggaacaa cagttgatac 2580
agacattaca cacccatatg agttcgattt ttacctctgt agccatgctg gaatacaggg 2640
taccagtcgt ccttcacact atcatgtttt atgggatgat aactgcttta ctgcagatga 2700
acttcagctg ctaacttacc agctctgcca cacttacgta cgctgtacac gatctgtttc 2760
tatacctgca ccagcgtatt atgctcacct ggtagcattt agagccagat atcatcttgt 2820
ggacaaagaa catgacagtg ctgaaggaag tcacgtttca ggacaaagca atgggcgaga 2880
tccacaagct cttgccaagg ctgtacagat tcaccaagat accttacgca caatgtactt 2940
cgcttaaata gtccaagtat attctctgag aggaagtact gaaagatgaa ttgacataca 3000
acgtatgttt ccagtgaagt caattgagta aggacacctc cagccataca gaaaccaaca 3060
ctgtgtgggg gccaaggtct gatccttatg ttaatacaag gaagattgtt tacttcatca 3120
aggaacacag catcattatg caatatgaaa ccagccaact gctttttgtg cggtctccta 3180
taggaagtat cgcaattgtt ttgttttcat ttcttgtagt ctaacccttt taatgccttt 3240
acctcaagtt gcttggcagc acaactatct ttgcaaaaaa aagtaaagaa aaagtaaatg 3300
atggtttaaa aaatacacac aaaaa 3325



<210> 67 

<211> 8114 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 2075194CB1



<400> 67 

cgcctcttcc tcccaaaggg aagacggtga gccggaggag tcagtcagag gggcgagcag 60
gagcgattcc gtcgccaaac aggttatgag tgccagtgag ccgccttaga tagaagcatc 120
gtcagcactt tattaatgat ggatagtgag aataaacccg aaaatgacga ggatgaaaag 180
ataaacaaag aagcacaaga cttgacaaag ctttcatccc ataatgaaga cggtgggcct 240
gtatctgatg tgatagcaag tttccctgag aattctatgg gcaaaagagg tttttcagaa 300
tcatcgaact ctgatagtgt tgttatagga gaagacagaa ataaacatgc ttccaaacgc 360
aggaaattag atgaggcaga gccccttaaa tctggaaagc aaggtatttg tagattagaa 420
acttctgaga gctcagtcac agaagggggt attgcattag atgaaacagg gaaggagacc 480
tttctgagtg actgcacagt tggaggcaca tgtctcccaa atgccctctc cccttcttgc 540
aattttagca ctattgatgt tgtttctctg aaaacagaca ctgaaaaaac atctgctcag 600
gaaatggttt cccttgatct ggaaagagaa tctcctttcc ccccgaaaga aattagtgtt 660
agttgtacca ttgggaatgt agatacagtt ctcaaatgcc agatctgtgg gcatttgttt 720
tcttcttgct ctgacttgga aaaacatgct gagtctcaca tgcagcagcc taaggaacat 780
acctgttgtc actgcagcca caaagcagag agcagctcag cactacatat gcatatcaaa 840
caagcacatg ggccacagaa ggtcttttcc tgtgatcttt gtggttttca gtgttcagaa 900
gaaaacttgt tgaatgcaca ttatcttggc aaaacacatc tccgtcgtca gaatctggct 960
gctcgtggag gatttgtaca gatcttaaca aaacaacctt ttcctaaaaa accacgtaca 1020
atggcaacaa aaaatgttca ctcaaaacca agaacttcta aatcaatagc aaagaatagt 1080
gattcaaaag gattacgaaa tgtgggaagc acgtttaaag atttcagagg aagtatttct 1140
aaacaaagtg gtagtagcag tgagcttctt gttgaaatga tgccttccag aaatactttg 1200
tcacaggaag tagagattgt tgaagaacat gttacttccc ttggtctagc tcagaatcct 1260
gaaaaccaga gtagaaagct agacacctta gtaacctcag agggtctctt agagaaattg 1320
gaatctacaa aaaataccct tcaggcagca cacggtaaca gtgtaacctc gaggccaaga 1380
cctgagcgaa atattcttgt gttgggtaat agctttcgtc gacgaagcag cactttcacc 1440
ttgaagggcc aggcaaagaa aaggtttaat cttttaggaa ttaaaagagg tacaagtgaa 1500
actcagagga tgtatatgaa acacttgaga acacagatga aaacacacga tgcagaatca 1560
gtgctgaaac acctggaagc gtgcagcagt gtgcagagag tgtgtgtgac tacctcagaa 1620
acccaggagg cagagcaggg ccaggggagt gcccgtcctc cggactccgg gctgcattcc 1680
ctgacagtga agccagcttc tggctctcag acgttgtgtg cttgtacaga ctgtgggcaa 1740
gtagctacaa ataggacaga tttggaaatc catgtgaaaa ggtgccatgc cagagagatg 1800
aaattttact gccgtacttg tgacttctct agtatgtcaa gaagggactt agatgaacat 1860
ttgcacagta accagcatca gcaaactgct tctgtcctga gttgtcagtg ttgttcattt 1920
atatccttgg atgaaataaa tcttagagac cacatgaagg aaaagcacaa tatgcatttt 1980
ctttgcaccc cttgtaatct gttctttttg tctgaaaaag atgtggaaga acacaaagcc 2040
accgagaagc atattaattc attggttcaa ccaaagactt tgcaatcatc taacagtgat 2100
ttggttttac agactttacc tttgagtact ttagaatcag aaaacgcaaa agagtctatg 2160
gatgactcag gaaaagcatc tcaggaagaa cctctgaagt ccagggtaag ccatggtaat 2220
gaagtgaggc attccagtaa gcctcagttt cagtgtaaga agtgttttta taaaacaaga 2280
tcttctactg ttctcacgag acatataaag cttcggcatg gtcaagacta tcattttctt 2340
tgtaaagctt gtaatcttta ttcattgagc aaagaaggaa tggagaaaca cattaaaaga 2400
agcaagcatc ttgaaaatgc taagaaaaat aatattggct taagctttga agaatgtatt 2460
gaaagggtat gtataggtgc aaatgataaa aaagaagagt ttgatgtttc cggaaatgga 2520
aggattgaag gccatatagg tgtgcaatta caagagcatt cctatcttga gaagggcatg 2580
ctggcgtctg aggaactgtc acagtctggt ggtagcacca aagatgatga attagcttca 2640
accactactc caaagagagg gagacctaaa ggtaacatct cacggacgtg ttcacactgt 2700
ggccttttgg cctctagtat tacaaacttg actgttcaca ttagacgaaa acacagtcac 2760
cagtatagtt atttatgcaa agtgtgtaag tattacactg taactaaggg agatatggaa 2820
cgtcattgtg ccaccaagaa acataaagga cgggtagaaa tagaagcaag tggaaaacac 2880
agttcagata tcattgttgg ccctgaaggg ggtagccttg aagctggtaa aaagaatgct 2940
ggctcagcag tgaccatgtc agatgaacat gctaacaaac cagctgagtc acccacctcc 3000
gttttagaga agccagatcg tggaaactca attgaagctg aagttgaaaa tgtatttcat 3060
tctctagatg gagaagttaa cagccatctt cttgataaaa aggagcaaat atcttcagag 3120
ccagaggact tcgcccagcc gggggatgtg tactcccaga gagatgttac aggcacaggt 3180
gagaataagt gtttgcactg tgagtttagt gctcactcct ctgcttctct agagctgcat 3240
gtaaaacgga aacatacaaa agagtttgag ttttattgca tggcatgcga ttactacgcg 3300
gtgactcgtc gcgagatgac caggcatgca gcaacagaga agcacaaaat gaaaaggcag 3360
tcttatctca actctgctaa tgtagaagct ggttctgcag acatgtccaa aaacatcatt 3420
atgcctgaag aagagcatca acaaaattct gaggaatttc aaataatttc aggtcaacca 3480
tctgatactc ttaaatctag aaatgctgca gattgctcta ttttaaatga gaatactaat 3540
ttagatatgt ctaaagtgct ctgcgctgct gactctgtag aagttgagac tgaagaagaa 3600
tctaatttca atgaagacca ttccttttgt gagactttcc aacaggctcc tgtcaaggat 3660
aaagttagga aacctgagga gatgatgtca cttactatgt cctcaaacta tggctcccca 3720
agcagatttc aaaatgaaaa ttcaggaagc tctgccttaa attgtgagac agcaaagaaa 3780
aaccatgaga tatcgaatga tgcaggtgag ctgcgtgtcc attgtgaggg tgaaggagga 3840
aacgcaggag acggtggagg tgttgtcccc cacagacacc tgtgccctgt gacgctcgat 3900
ggggagcgct cggctgaaag ccctgtgctc gttgtgacaa gaataaccag agaacaggga 3960
aatctggaga gcgggggtca gaacagagtt gcacgtgggc atggtttgga agacttgaaa 4020
ggtgtccaag aagatcccgt tctggggaat aaggaaattc tgatgaattc acaacatgaa 4080
acagaattta ttttggagga ggatggccca gcttctgata gcacagttga aagtagtgat 4140
gtctatgaaa ctataattag tattgatgat aaagggcagg ccatgtacag ttttggtcga 4200
tttgactcct ccataataag aataaagaac cctgaagatg gtgagttgat agaccagtct 4260
gaagagggct tgatagcaac gggagtgaga attagtgagc tgcccttgaa agactgtgct 4320
caaggtgtga aaaagaagaa atctgagggc agttccattg gtgagtctac acgaattcgc 4380
tgtgatgatt gtggcttctt agcagatgga ctgagtggac tgaatgttca catagccatg 4440
aagcatccta caaaagagaa gcacttccat tgtttactct gtggaaagtc gttctatacc 4500
gaaagcaacc ttcaccagca tctggctagt gccggccaca tgagaaatga gcaggccagt 4560
gtggaggagc ttccggaggg aggggccacc tttaaatgtg tcaagtgcac agagcccttt 4620
gattctgaac agaatttatt tctacatatt aaaggacagc atgaggaatt gctgcgggag 4680
gtgaataagt atatagtgga agacactgag caaatcaacc gcgagaggga ggaaaaccag 4740
ggaaacgtct gcaagtattg tgggaagatg tgtcgaagca gcaactcgat ggccttcctg 4800
gcacacattc gcactcacac aggatcaaaa ccattcaagt gcaagatatg ccattttgca 4860
acagctcagc ttggagatgc cagaaaccat gtgaaaaggc accttgggat gagggaatac 4920
aagtgtcatg tctgtggggt tgcttttgta atgaagaagc acttaaatac tcatctacta 4980
ggcaagcatg gagttggcac cccaaaagaa aggaaattta catgccactt atgtgataga 5040
agtttcacag agaagtgggc cctgaacaac cacatgaaac tccacacggg agaaaagccg 5100
tttaaatgta cctggcccac gtgccattac tcattcctca cagcctccgc aatgaaagac 5160
cactacagga cgcacacagg cgagaagtcg tttctgtgtg acctctgcgg ctttgccggc 5220
gggacccgcc acgccctcac caagcatcgc agacagcaca caggagaaaa acctttcaag 5280
tgcgatgagt gtaactttgc ctccacaact cagtcccatt tgactcggca taaacgtgtc 5340
cacactggag aaaagcccta cagatgcccc tggtgtgact acaggtcaaa ctgtgctgaa 5400
aatatccgca aacacattct gcatactggc aaacatgaag gagtcaagat gtacaactgt 5460
cccaagtgtg actacgggac caacgtcccg gtggagttcc ggaaccattt gaaggaacag 5520
catcctgaca tcgaaaaccc ggacctcgct tacctgcatg ctggcattgt ttccaagtcg 5580 
tacgagtgcc gtctaaaggg acaaggagcc accttcgtgg agacagacag ccccttcacc 5640 
gcggcggcct tggcagaaga gcccctcgtc aaggagaagc ccctcagaag cagcaggagg 5700 
ccagcgccgc cccctgagca ggtgcagcag gtcatcatct tccagggcta cgacggggag 5760 
tttgccctgg acccctcggt ggaggagacg gccgccgcca cgctgcagac gctggccatg 5820 
gccggccagg tggcccgggt ggtgcatatc acggaggatg gccaggtcat cgccacgagt 5880
cagagcgggg cacatgtagg cagcgtggtg cccggaccca tcctccccga gcagctggct 5940 
gatggagcca cccaggtggt cgtcgtgggg ggctccatgg aaggccacgg catggatgag 6000 
tccctcagtc caggtggcgc tgtgatacaa caggtgacca agcaggagat tttaaacctc 6060 
tcggaggctg gagtcgctcc ccccgaggca tcctcagccc tggatgcatt gctctgtgcg 6120 
gtcactgaat taggggaggt ggagggcagg gctgggctcg aggagcaagg caggcccggc 6180 
gccaaagacg tgctgatcca gctgcccggg caggaggtct cccatgtggc tgccgacccc 6240 
gaggcccccg agatccagat gttcccacag gcccaggaga gcccggccgc cgtggaggtg 6300
ctcacccagg tggtccatcc ctcagcagcc atggcctctc aggagcgggc acaggtggcc 6360
ttcaagaaga tggtccaggg cgtcctccag tttgctgtgt gtgacacggc cgcggccggc 6420 
cagttggtca aggacggtgt cacccaggtg gtggtgagcg aagagggtgc cgtccacatg 6480 
gtcgccgggg agggtgccca gatcatcatg caggaggcgc agggcgagca catggatctg 6540 
gtggagtccg acggggagat ctcgcagatc atcgtgacag aggagctggt ccaggccatg 6600 
gtgcaggagt ccagtggcgg cttctccgag ggcaccacgc actacatcct gacagagctg 6660 
cccccagggg tgcaggacga gccgggcctg tactcccaca ccgtgctgga gactgcggac 6720 
tcgcaggaac tcctgcaggc cggggccacg ctaggcacag aggccggggc cccaagcagg 6780 
gcagagcagc tggccagcgt ggtcatctac acccaggagg gctcctcggc cgcggcggca 6840
attcagagcc aaagagaaag cagcgaactc caggaagcat gagacgcgcg gcacctttac 6900 
tcagcacagg gcaggtgtgg gaaggtccag cttcggtggg ggaccgtgtt ccctgagctt 6960 
catctgaaac cttcaaaacc atgaggacaa ggctcccgtg agctctgagc atgccctccc 7020 
agcgagagtc acactggcca ccagccaggc gcccacagag ggtaccgtgg gctgggcctc 7080 
ergggagcagg ctgccaagtg caggggaggg ccgggcgcag gccgcacagg gagctccggt 7140 
ccactggggg cccttcgatc agtggcctcc cgcttcgtcc tggccgctgt gctgaagaga 7200 
agccaagtgt tgttggtgtt tttctctccc aagtgttttc ccatttcagt tatcagaagg 7260
tcatggccgt ggggaaagtg gtgaagatac ccctcctggc ttggggtgca cctgcttgtg 7320 
cagtcagcat gtagctgcct ttccatttca ttctctactg ggctaaaaat tgcagctaca 7380 
agtgttacca tcttgaagca gtccacttcc attcaatttt ttttttttaa ttttagaata 7440 
acagtgtccc cataccaaag gaagcctgct agctcatttc atgtataaat ttcccatctt 7500 
caaacagttt aggtgtattt gttgctctgg tcacattctg cataaaagaa atcctcttaa 7560 
gcctatggtt aagaaaagcc ttgaagttta tattcagtta aaatatatgt cggtggagat 7620 
agccagtgct tctaattttg acttagtttc atacagtaaa gcctaaatgt gaaacgcaca 7680 
cgctggaaga tattgttcct atcaatattt tgctttttat aacaagggtt tgttcatatt 7740 
gatgccattt ttgcaggatt tcttcgtgat ttctgtccat atgaaaatgc tgacattaaa 7800
cattaacaca tggagaccgt gccctgtggc cctgccgtgg ctgccagcat ggtctgtgtt 7860 
tccttgtgga ttcacctgtg gccctgctgt ggccaccagc atggtctgtg tcctcgtgga 7920 
ttcactgcag ctgtcggatg cgagtttctg tcataatcat ttgtttcctg atacaattgt 7980 
tcttattctt ttccaaaact gtaaaataat ctcctccctc aaatgcaaag gttgfcttttg 8040 
ttctgtttct gttttctttg aaataaaatt ataacgttaa aagataaaaa aaaaaaaaaa 8100 
aaaaactgcg tgtc 8114 

<210> 68 

<211> 1530 

<212> DNA 

<213> Homo sapiens 

<220> 

<221> misc_feature

<223> Incyte ID No: 2801633CB1 

<400> 68 

ggcgcggctt ttgcttgtag ctccagccag agctcggtta gggcctcatc gctctgctcc 60
cgctccttag ggaagcctcg gtgattctgc cacagcctca gcctctgtgg ctctgtgacc 120 
tgccggtatt ggatgattcg tatctaagac tctgggacac tcctgaagtc gggaaatgga 180 
actcttaaca ttcaaggatg tggccataga attctctcca gaagagtgga aatgtctgga 240 
catttcccag cagaatttat atagagatgt gatgttggag aactacagaa acctggtctc 300
cctgggtgtt actatctcta acccagacct ggtcaccagt ctggagcaaa gaaaagagcc 360
ctacaatttg aagatacatg aaacagcagc cagaccccca gctgtgtgtt ctcatttcac 420 
ccaaaacctt tggacagtgc agggcataga agattcattc cacaaactta taccaaaagg 480
acatgagaaa cgtggacatg agaatttaag aaaaacttgt aaaagtataa atgagtgtaa 540 
ggtgcagaaa ggtggttata atagaattaa ccaatgctta ttaactaccc agaaaaaaac 600 
aattcaatct aatatatgtg tcaaagtttt tcataaattt tcaaattcaa acaaagataa 660 
gataagatat actggagata aaacctttaa atgtaaagaa tgtggcaaat catttcacgt 720 
gctctcacgc ctaactcaac acaaaagaat tcatactgga gagaacccct acacatgtga 780 
agaatgtggc aaagccttta attggtcctc aattcttact aaacataaga gaattcatgc 840 
cagagagaaa ttctacaagt gtgaagaatg tggtaaaggc tttactcggt cctcacacct 900 
tactaaacat aagagaattc atactggaga gaaactctac acatgaaaaa attgacaaag 960 
cttttaaccg caactcaatc tgttctaaac ataagagaaa tggtattggt gagaagccat 1020
aaaaatatga aaaatgtgga aaacccttca aatgcttgtc acatcttact gaatataatt 1080 
cttactgcag aaaaccccta ggaatattaa aagtgtggca aaacctttaa ccaatgctca 1140 
tatctctttg cacatgatag catttatact tgagaaaaat tgtacaaata tagaaaatgt 1200 
agaaaagcca ttaatgccta ctcatgtgtt actaaatatc agagagtttg tacttaataa 1260 
aaggattata aatgtagtat ttgttgaaag acctattaga aaatacaggt cttaaaagtg 1320 
aagaagagta ttctgaagat agacaataga aatagtaaga gggttgtagt acctgtactt 1380
gcatcatggg tcttattgtg catatttcat actagaagaa aaccctgaag cagttgccca 1440 
aactttcttg acattagaga atttatatcg gaaagaaatt ttacaaatgt gataaatttg 1500 
gaaaagcaaa aacaaaaaac aaaaaaaaaa 1530 



<210> 69 
<211> 2026 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7493525CB1 



<400> 69 

gctccaggtc tccccttcgc tgctctgtgt cctctgctcc tagaggccca acatctgtgg 60
ccctgtgacc tgcaggtatt gggagaccca cagctaagac accgggaccc gctgaaagcc 120
tagaaatgga cgacttgaaa tatggagtgt atcctctcaa ggaagcaagt ggatgccctg 180
gggctgagag gaatcttcta gtttactctt attttgaaaa ggagacattg acatttaggg 240
atgtggccat agaattctct ctggaggagt gggaatgcct gaaccctgct cagcagaatt 300
tatatatgaa tgtgatgtta gaaaactaca aaaacctggt cttcttggca ggtgttgctg 360
tctctaagca agacccagtc acctgtctgg agcaagaaaa agagccctgg aatatgaaga 420
gacatgagat ggtggatgaa cccccagcta tgtgttctta ttttaccaaa gacctttggc 480
cagagcaaga cataaaagat tcttttcaac aagtaatact gagaagatat ggcaaatgtg 540
aacatgagaa tttacagtta agaaaaggct ccgcaagtgt agatgagtat aaggtgcaca 600
aagaaggtta taatgagcta aaccagtgtt tgacaactac ccagagcaaa atatttccat 660
gtgataaata tgtgaaagtc tttcataaat ttttaaatgc aaatagacac aagacaagac 720
atactggaga gaaacctttc aaatgtaaaa aatgtgatga atcattttgc atgcttttac 780
acctaagtca acataaaaga attcatatta gagagaattc ttaccaatgt gaagaatgtg 840
gcaaagcttt taaatggttc tcaaccctta ctagacacaa gagaattcat actggagaga 900
aacccttcaa atgtgaagaa tgtggcaaag cttttaagca ctcctcaacc cttactacac 960
ataagatgat tcatactgga gagaaaccct acagatgtga agaatgtggc aaagccttct 1020
accattcttc acaccttact acacataagg taattcatac tggagagaag cccttcaaat 1080
gtgaagaatg tggtaaagct tttaaccacc cttcagccct tactacacat aagttcattc 1140
atgttaaaga aaaaccctac aaatgtgaag aatgtgacaa agcttttaac cgattctcat 1200
accttactaa acataagata attcattctg gagagaaatc ttacaaatgt gaacaatgtg 1260
gcaaaggctt taactggtct tcaaccctta caaaacatag aagaattcat actggagaga 1320
aaccctacaa atgtgaagaa tgtggcaaag cctttaatgt gtcttcacac cttactacac 1380
ataagatgat tcatactgga gagaaaccct acaaatgtga agaatgtggc aaagccttta 1440
accactcctc aaaacttact atacataaga taattcatac tggagagaaa ccttacaaat 1500
gtgaagaatg tggcaaagct tttaaccaat cctcaaacct taccaaacat aagataattc 1560
atactggaga gaaactctac aaatgtgaag aatgtggcaa agcttttaac cgatcctcaa 1620
accttactac acataagaga attcacactg gagagaaacc ctacaaatgt gaagaatgtg 1680
gcaaagcttt caaccgatcc tcaaacctta ctaaacataa cataattcat actggagaga 1740
aatcttacaa atgtgaagaa tgtggtaaag cctttaacca atcctcaact cttactaaac 1800
ataggaaaat tcagcagggc atggtggctc atgcctgtaa tcccaacact ttgagaggac 1860
taggtgagca gatcgcgagg tcaggagttc aagaccagcc tggccaacat ggtaaaaccc 1920
catctctact aaaaatacaa aaatttgctg ggtgtggtgg caggcgcctg taatcccagc 1980
tacttggctc actgcaggct ccgccttccg ggttcacgcc afctctc 2026

<210> 70 

<211> 1724 

<212> DNA 

<213> Homo sapiens 



<220> 

<221> misc_feature

<223> Incyte ID No: 7021892CB1 



<400> 70 

atgctgttgg 

tctatgtcga 

gcagccatga 

ggtaaaaacc 

tggattaaaa 

agcccctggc 

aggggagaaa 

agatgtcctg 

tgctgcctcc 

cgtttctgct 

ctggtttcca 

aggatgagga 

atcatttctg 

caagctgaga 

cgccattact 

gaatctgtca 

ggttgcagag 

agtccccagt 

tacaatgtta 

tggcgtccat 

tgttctgtga 

aacatttgaa 

ttttccgcta 

accatgaaca 

atgtcgtttt 

aaggggcgcg 

gtttcctatg 

caccacccat 

cggagccgcg 



tggccaccat 
ataatgaacg 
ctcttcacgt 

tgtccaatag 
aggagggaaa 
aggcagctct 
ctccagaagg 
tctgtctaaa 
agtgcctcaa 
ctgtggtctc 
tcatcaagga 
agtttcaagt 
aagacctgag 
ggttcgacac 
gggaggtgga 
accgacaggg 
aaggaaaggt 
tgcacagagt 
gtgatgggtg 
tttttgctca 
tcaatccatc 
cataatcatc 
gatacacata 
gaaagcaatt 
ctctgggctt 
cgattatgag 
gggggcggat 
acagcacaca 
acaccgctgg 



attggacagc 
atattcttca 
ctgtacccgt 

taacaatctc 
gggtgtggct 
cactccagat 
agagacattt 
agatcttgaa 
ttcactccag 
tcagaaggat 
actagagccc 
ggatatgacg 
gagtttccga 
tgccctgtgc 
cgtgggcacc 
gaagattgtg 
ctttgctgcc 

ccatatctac 
taaacgtgga 
cgctgccagt 
tttaggaagt 
ggtacaaagg 
atattaatga 
atgttatatt 
ctcgtcgacc 
tagacggcgt 
ccttggaagg 
cgaagcaggc 



ccgggtctag 
ttccagctag 
attgcctggt 
aatgatggca 
aaggtgggtg 
ctgagctgtc 
gccatggctg 
gaagccgtgc 
aaggagcccg 
gacatcaagc 
aagctgaaat 
ttcgatgtgg 
agtggggatt 
gtcctgggca 
agccaagtgt 
ctttcttcag 
agcactgtgc 
ctggatgtag 
acattcatcg 
agtcaagatg 
gccccagttt 
ttcagtgctc 
gacagaaggg 
ggggaaaatt 
tctgttcata 
gggattatcc 
aacaggctag 
ggggccaaac 
acgcacacca 



acccttttta 

catgtgaaat 
acaagggata 
gaatgaaatc 
gagacaccct 
cgcagaagca 
agcacttcaa 
aactgaaatg 
atggggaagg 
ccaagtacaa 
ctgttctaac 
acacagccaa 
tgagccagaa 
cccctcgctt 
gggatgtggg 
aacacggctt 
ctatgactcc 
gtatgaggtc 
agattcctgt 
atcagagcat 
cttctgaggg 
ccatagccat 
aaagagccat 
acattgtact 
aatatttgca 
ggccggtact 
ctgtcccggt 
gcgcacacga 
acgg 



ccacttcaac 
tcagaagtca 
tcacatagta 

agaatctgat 
ctggtataaa 
gctggaggcc 
acagatcatt 
tggatatgcc 
tttactgtgc 
gctgagggcg 
aatgaaccca 
caactatctc 
taggaaggag 
cacttccggc 
cgtgtgcaag 
cttgactgtg 
tctctgggtg 
cattgccttt 
ttgcgagccc 
cctgagtatc 
aaagtaaata 
agctaagaac 
cactctgtaa 
taaagttgtc 
aaaaaaacaa 

agggcgtaca 

gaatgtccgc 
aagcgaggcg 